Lecture URL: https://lms.knou.ac.kr/dks/user/home/initUSTHomeIndex_GRSC.do?stLeftMenuId=0
Senior student's blog: https://insb.tistory.com/9?category=967351
- Follows last year's lectures by Prof. Kim Sung-soo

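Basics

Introduction to R

Free
Interactive programming language
Object-oriented system
- Data, variables, matrices, and so on are all objects
- Objects are created by assignment with = or <-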

x = 2:10 
x

y <- 3*x + 5
y

a <- 3:9
b <- 3*a + 5
plot(a, b)

plot(a, b, pch=18)

History of R

1980s: the statistical language S is implemented at AT&T Bell Labs (the lab where C was developed)
- In 1998 John Chambers received the ACM Software System Award for S, which "forever altered how people analyze, visualize, and manipulate data"
Two professors at the University of Auckland, Ross Ihaka and Robert Gentleman, created a reduced version of S known as "R & R"
1995: Martin Maechler persuaded R & R (Ihaka and Gentleman) to release the R source code as open source under the GPL, as with Linux
August 1997: the R core team is formed. February 29, 2000: R version 1.0.0 is released
- December 2015: R version 3.2.3

Downloading R

Download the base distribution

Setting the working directory

getwd()

setwd(".")

setwd( choose.dir() )

Error in setwd(choose.dir()): missing value is invalid
Traceback:

1. setwd(choose.dir())
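
The error above is what setwd() reports when it is handed NA, which is what choose.dir() returns when the dialog is cancelled (choose.dir() is also available only on Windows). A minimal sketch of a guard follows; `set_wd_safely` is a hypothetical helper name, not part of R.

```r
# Hypothetical helper: change directory only when given a usable path.
set_wd_safely <- function(dir) {
  if (length(dir) != 1 || is.na(dir) || !dir.exists(dir)) {
    message("No valid directory given; working directory unchanged.")
    return(invisible(FALSE))
  }
  setwd(dir)
  invisible(TRUE)
}

before <- getwd()
ok <- set_wd_safely(NA_character_)  # what a cancelled choose.dir() would return
ok
```

On Windows the call would then read set_wd_safely(choose.dir()).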

Introducing RStudio

rstudio.com

Introducing R Commander

A menu-driven R package, developed by John Fox

Textbook

First step

install.packages("ISwR")

package 'ISwR' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
	C:\Users\cho_desktop\AppData\Local\Temp\Rtmpo5uUeT\downloaded_packages

library(ISwR)

Warning message:
"package 'ISwR' was built under R version 3.6.3"

plot( rnorm(1000), pch = 19)

1.1.3 Vector arithmetic

weight <- c(60, 72, 57, 90, 95, 72)
height <- c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91)
bmi <- weight/height^2

bmi

# Number of elements in a vector -> length()
xbar <- sum(weight) / length(weight)
xbar

mean(weight)

sqrt(sum((weight-xbar)^2) / (length(weight) - 1))

sd(weight)

1.1.4 standard procedures

bmi

t.test(bmi, mu=22.5) # H0: mu=22.5 / H1: mu!=22.5

# The output reports the t statistic, degrees of freedom, and p-value

	One Sample t-test

data:  bmi
t = 0.34488, df = 5, p-value = 0.7442
alternative hypothesis: true mean is not equal to 22.5
95 percent confidence interval:
 18.41734 27.84791
sample estimates:
mean of x 
 23.13262
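
The numbers in the t.test() output can be reproduced by hand, which shows what the test computes. This is a sketch using the same weight and height vectors as above; the variable names (mu0, se, t_stat, and so on) are chosen here for illustration.

```r
weight <- c(60, 72, 57, 90, 95, 72)
height <- c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91)
bmi <- weight / height^2

mu0 <- 22.5                            # hypothesized mean under H0
n <- length(bmi)
se <- sd(bmi) / sqrt(n)                # standard error of the mean
t_stat <- (mean(bmi) - mu0) / se       # t statistic
df <- n - 1                            # degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df)    # two-sided p-value
ci <- mean(bmi) + c(-1, 1) * qt(0.975, df) * se   # 95% confidence interval

c(t = t_stat, df = df, p = p_value)
ci
```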

1.1.5 Graph

# Instead of a color name (upper or lower case), col= also accepts a number such as col=2, just like pch=number

plot(height, weight, pch=2, col="RED")

# Create a vector of heights
hh <- c(1.65, 1.70, 1.75, 1.80, 1.85, 1.90) 
# Add the curve with lines()
lines(hh, 22.5*hh^2, col="BLUE", lty=2) # lty = line type; 2 is a dashed line

Error in plot.xy(xy.coords(x, y), type = type, ...): plot.new has not been called yet
Traceback:

1. lines(hh, 22.5 * hh^2, col = "BLUE", lty = 2)
2. lines.default(hh, 22.5 * hh^2, col = "BLUE", lty = 2)
3. plot.xy(xy.coords(x, y), type = type, ...)

plot(height, weight, pch=2, col="RED")

hh <- c(1.65, 1.70, 1.75, 1.80, 1.85, 1.90) 
lines(hh, 22.5*hh^2, col="BLUE", lty=2)

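1.2 R language essentials

# - In the R console, strings entered with single quotes are printed with double quotes.
# - Here (in the notebook) double quotes are printed as single quotes, like Python.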
c("Huey", "Dewey", "Louie")

c('Huey', 'Dewey', 'Louie')

c(T, T, F, T)

bmi

# A condition applied to a vector yields a logical mask.
bmi > 25

cat( c("Huey", "Dewey", "Louie") )

Huey Dewey Louie

cat( c("Huey", "Dewey", "Louie", "\n") )

Huey Dewey Louie

cat("What is \"R\"?\n")

What is "R"?

Example reference site

Blog hosting the example files: https://booolean.tistory.com/913?category=807312

1.2.5 Missing Value

R represents missing values as NA.

wd <- read.table("./data//01/wd.txt", sep = "\t", header=T) # quoted path, sep = delimiter, header=T (the first line holds the column names)
wd

nwd = wd

nwd

nwd[ nwd > 0.9 ] = 99

head(nwd)

nwd[nwd == 99] = NA

head(nwd)

# is.na() builds a logical mask over the whole data object. For a data frame the mask is a logical matrix, so summarize it by row or by column.
is.na(nwd)

rowSums( is.na(nwd) )

colSums( is.na(nwd) )

# -> na.omit() drops every row that contains an NA.
# -> The original row numbers are printed with the result, so you can check which rows were dropped.
na.omit(nwd)

nwd
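
Since wd.txt is not included here, the same missing-value workflow can be sketched on a small made-up data frame. The data and the names df and clean are assumptions for illustration only.

```r
# Made-up data frame standing in for wd; not the course file.
df <- data.frame(x = c(1, NA, 3, 4),
                 y = c(NA, NA, 6, 7),
                 z = c(8, 9, 10, 11))

na_per_col <- colSums(is.na(df))   # NA count per column
na_per_row <- rowSums(is.na(df))   # NA count per row
complete.cases(df)                 # TRUE for rows without any NA

clean <- na.omit(df)               # keeps only the complete rows
rownames(clean)                    # original row names survive: "3" "4"

mean(df$x, na.rm = TRUE)           # many summaries need na.rm = TRUE to skip NA
```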

1.2.6 Functions that create vectors

x <- c(1,2,3)
y <- c(10, 20)

c(x, y, 5) # vectors and single values are concatenated into one flat vector

# logical + numeric -> the logical is coerced to 0/1 so the vector has a single type
c(FALSE, 3)

c(FALSE, "abc")

c(pi, "abc")

seq(4, 9)

oops <- c(7, 9, 13)

rep(oops, 3)

# - Here the elements are repeated once, twice, and three times, in index order
# - The repeat count for each element is supplied as a vector.
rep(oops, c(1,2,3))

rep(oops, 1:3)

rep(1:2, c(10, 15))
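
The coercion behind c() and a few variations of seq() and rep() can be checked directly. A short sketch; the by=, length.out=, and each= arguments are standard base R but do not appear in the notes above.

```r
# c() coerces mixed inputs to the most general type present
class(c(FALSE, 3))        # "numeric": FALSE becomes 0
class(c(FALSE, "abc"))    # "character": FALSE becomes "FALSE"
class(c(pi, "abc"))       # "character"
sum(c(TRUE, TRUE, FALSE, TRUE))   # 3, because TRUE counts as 1

# seq() also takes a step or a target length
s1 <- seq(4, 10, by = 2)          # 4 6 8 10
s2 <- seq(0, 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00

# rep() with each= repeats element by element
r1 <- rep(c(7, 9, 13), each = 2)  # 7 7 9 9 13 13
```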

1.2.7 Matrices and arrays

x <- 1:12
x

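# Assigning dim turns the vector into a matrix.
dim(x) = c(3, 4)

x # now printed with row indices [1,] to [3,] and column indices [,1] to [,4]

# - Given nrow, ncol is inferred from the length; byrow=T fills the matrix row by row.
matrix(1:12, nrow=3, byrow=T)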

LETTERS[1:3]

rownames(x) <- LETTERS[1:3]

x

t(x)

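# cbind binds vectors side by side as columns; rbind lays them down and stacks them as rows

cbind(A = 1:4)

cbind(A = 1:4, B = 5:6, C = 9:12) # B has length 2, so it is recycled to 5, 6, 5, 6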

rbind(A=1:4,B=5:8,C=9:12)

aa <- cbind(A=1:4,B=5:8,C=9:12)
aa

# - To attach a constant column on the left or right, bind the constant itself; it is recycled to the matching length.

cbind(1, aa)
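
The difference between the default column-wise fill and byrow=TRUE is easiest to see by looking up the same cell in both matrices. A small sketch; by_col, by_row, and v are names chosen here.

```r
by_col <- matrix(1:12, nrow = 3)                # default: fill down the columns
by_row <- matrix(1:12, nrow = 3, byrow = TRUE)  # fill across the rows

by_col[2, 3]     # 8
by_row[2, 3]     # 7
dim(t(by_col))   # 4 3: transposing swaps rows and columns

# Assigning dim() is equivalent to the column-wise matrix() call
v <- 1:12
dim(v) <- c(3, 4)
all(v == by_col)
```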

Factors (categorical data)

Categorical variables should be converted to factors, and factor() is the function for that.

pain <- c(0, 3, 2, 2, 1)

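# -> levels= lists the values that occur in the data, in the order the levels should have.
# -> Even for a numeric vector, the levels are stored as the strings '0' to '3'.
# -> Renaming the levels afterwards renames the matching elements as well.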

fpain <- factor(pain, levels = 0:3)

fpain

# -> Each original value is replaced by the label of its level.
levels(fpain) <- c("none", "mild", "medium", "severe")

fpain
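
A factor stores integer codes plus a table of level labels, which is why renaming the levels renames every element. The sketch below builds the same factor in one step with labels= and inspects both parts.

```r
pain <- c(0, 3, 2, 2, 1)
fpain <- factor(pain, levels = 0:3,
                labels = c("none", "mild", "medium", "severe"))

codes <- as.integer(fpain)   # internal codes: 1 4 3 3 2
levels(fpain)                # "none" "mild" "medium" "severe"
counts <- table(fpain)       # counts per level, in level order
counts
```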

Converting numbers to labels by overwriting with a factor

Let's convert the variables job and edu from numeric codes to labels (factors).

insurance = read.table("./data/01//insurance.txt", sep = "\t", header=T)

head(insurance)

insurance$job

insurance = read.table("./data/01//insurance.txt", sep = "\t", header=T, 
                       na.strings = "-9")

insurance$job

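# Supply the replacement labels at the same time, in the same order as the levels.
# Labels: "근로자" = laborer, "사무직" = office worker, "전문가" = professional
insurance$job <- factor(insurance$job, 
       levels = 1:3,
       labels = c("근로자", "사무직", "전문가")
      )

insurance$job

head(insurance)

# Labels: "무학" = no schooling, "국졸" = elementary school, "중졸" = middle school, "고졸" = high school, "대졸" = college
insurance$edu2 <-
factor(insurance$edu, 
       levels = 1:5,
       labels = c("무학", "국졸", "중졸", "고졸", "대졸")
      )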

insurance
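
The effect of na.strings can be tried without insurance.txt by reading from an inline string. This is a sketch on made-up rows; the column layout, the English labels, and the names txt, raw, and cleaned are assumptions, with -9 playing the role of the missing-value code.

```r
# Made-up tab-separated rows; -9 marks a missing code.
txt <- "id\tjob\tedu
1\t1\t3
2\t-9\t4
3\t3\t-9"

raw     <- read.table(text = txt, sep = "\t", header = TRUE)
cleaned <- read.table(text = txt, sep = "\t", header = TRUE, na.strings = "-9")

raw$job       # 1 -9 3: the code is kept as an ordinary number
cleaned$job   # 1 NA 3: the code is read as a missing value

mean(raw$edu)                     # distorted by the -9 code
mean(cleaned$edu, na.rm = TRUE)   # 3.5

job <- factor(cleaned$job, levels = 1:3,
              labels = c("laborer", "office", "professional"))
job
```

Marking the code at read time matters most for numeric columns, where a leftover -9 would silently enter means and sums.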

1.2.9 list

A list combines objects into a single composite object.

intake.pre <- c(5260, 5470, 5640, 6180, 6390)
intake.post <- c(3910, 4220, 3885, 5160, 5645)

mylist <- list(before=intake.pre, after=intake.post)


mylist # collects the components so each can be extracted by name, like a data frame column

mylist$before

dataframe

Among composite objects, a data frame is the one with a table (matrix-like) layout.
When a text file is read, the result is almost always a data frame.

d <- data.frame(intake.pre, intake.post)
d

d$intake.pre

1.2.11 Indexing

Indexing extracts one or several elements (or columns) of a vector, list, or data frame, either in order or as a subset.

intake.pre[c(1,5)]

intake.pre[- c(1,5)]

1.2.12 Conditional selection

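Conditional selection uses square brackets like indexing, with a condition in place of the indices.
- When separate vectors share the same ordering (for example by patient id), a logical vector built from one of them, even a seemingly unrelated one, can go in the brackets to select from another.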

intake.post

intake.pre

# 1) For the patients whose pre value was above 6000,
# 2) what are the post values?
intake.post[ intake.pre > 6000 ]

intake.post[ intake.pre > 5500 & intake.pre <= 6200 ]

# - An empty row or column position selects all rows or all columns
d [ d$intake.pre > 6000, ]

1.2.15 implicit loops

Functions with a built-in loop

library("ISwR")

data(thuesen)

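head(thuesen, 3) # thuesen is a data set with two variables

# - lapply ("l" for list) returns a list with one named element per variable, like before=, after=
lapply( thuesen, mean)

# - na.rm = T: remove missing values before computing
lapply( thuesen, mean, na.rm = T)

# - sapply simplifies the result to a plain vector.
sapply(thuesen, mean, na.rm = T)

# - tapply does not accept a whole data frame.
# - The arguments go in the order (data variable, grouping variable, summary function).
tapply(thuesen, mean, na.rm = T)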

Error in unique.default(x, nmax = nmax): unique() applies only to vectors
Traceback:

1. tapply(thuesen, mean, na.rm = T)
2. lapply(INDEX, as.factor)
3. FUN(X[[i]], ...)
4. factor(x)
5. unique(x, nmax = nmax)
6. unique.default(x, nmax = nmax)

tapply(thuesen$blood.glucose, thuesen$short.velocity, mean, na.rm = T)

data(energy)

energy

tapply(energy$expend, energy$stature, mean, na.rm = T)

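# - apply margin: 1 = summarize across each row / 2 = summarize down each column
m = matrix(rnorm(12), 4) # the second argument is nrow, so this is a 4 x 3 matrix
m

# - minimum of each column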
apply(m, 2, min)
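
tapply() is most useful when the grouping variable is categorical. The following sketch uses made-up values shaped like the energy data (a numeric variable and a two-level group) and a small fixed matrix so the apply() results can be verified by eye.

```r
# Made-up data: a numeric variable and a grouping variable.
expend  <- c(9.2, 7.5, 7.4, 11.5, 12.8, 8.1)
stature <- c("obese", "lean", "lean", "obese", "obese", "lean")

group_means <- tapply(expend, stature, mean)   # one mean per group
group_means

# apply() over a matrix: 1 = per row, 2 = per column
m2 <- matrix(1:6, nrow = 2)
row_sums <- apply(m2, 1, sum)   # 9 12
col_sums <- apply(m2, 2, sum)   # 3 7 11
```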

Sorting

Build an order vector for the sort variable with order(), then put it in the index position, where a logical mask would otherwise go.

intake

typeof(intake$post)

o1 <- order(intake$post)
o1

typeof(o1)

intake$post

# -> sort the column in ascending order using the order vector
intake$post[o1]

intake$pre

intake$pre[o1]
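
Because order() returns positions rather than values, the same vector can reorder whole rows of a data frame so the columns stay aligned. A sketch on a made-up five-row data frame; the name sorted_d is chosen here.

```r
pre  <- c(6180, 5260, 6390, 5470, 5640)
post <- c(5160, 3910, 5645, 4220, 3885)
d <- data.frame(pre, post)

o <- order(d$post)     # positions that sort post in ascending order
o                      # 5 2 4 1 3
sorted_d <- d[o, ]     # whole rows reordered together
sorted_d

d[order(d$post, decreasing = TRUE), ]   # descending order
sort(d$post)                            # sorted values only, no row alignment
```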

R environment

Session management

ls(): lists all objects in the workspace
rm(): deletes objects

ls()

rm(aa, b) # rm deletes objects; several can be removed at once, separated by commas

ls()

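# How can the variables inside a data frame be used directly?
# -> attach() the data frame to the search path, so its columns can be referred to by name.
# --> This makes them easier to pass as function arguments.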

head(thuesen, 3)

attach(thuesen)

ls()

blood.glucose

# - subset() does the same job as conditional indexing, but as a function call.

thue2 <- subset(thuesen, blood.glucose < 7)
thue2

# - transform() returns the data frame with a new log-transformed variable added
thue3 <- transform(thuesen, log.gluc = log(blood.glucose))
head(thue3, 4)

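# -> Inside the braces {}, you can create and modify variables within the data frame and delete intermediate ones.
thue4 <- within(thuesen, {
            log.gluc <- log(blood.glucose) # add a log-transformed column
            m <- mean(log.gluc) # mean (intermediate variable)
            centered.log.gluc <- log.gluc - m # new column built from the intermediate variable
            rm(m)
        })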


head(thue4)
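
An attached data frame stays on the search path until it is detached, so a one-off calculation is often simpler with with(). A sketch on a made-up data frame that borrows the thuesen column names; df2, mean_ratio, and n_low are names chosen here.

```r
# Made-up data frame with the same column names as thuesen.
df2 <- data.frame(blood.glucose  = c(15.3, 10.8, 8.1, 5.3),
                  short.velocity = c(1.76, 1.34, 1.27, 1.49))

# with() evaluates one expression using the columns as variables
mean_ratio <- with(df2, mean(short.velocity / blood.glucose))

# attach() is best paired with detach() to keep the search path clean
attach(df2)
n_low <- sum(blood.glucose < 7)
detach(df2)

mean_ratio
n_low
```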

The graphics system

2.2.1 plot layout

x <- runif(50,0,2)
y <- runif(50, 0,2)

plot( x, y, 
     main="Main title",
     sub="subtitle",
     xlab="x-label",
     ylab="y-label",
    )

# -> text() places text at the given x, y coordinates.
text(0.6, 0.6,
    "text at (0.6, 0.6)")

Error in text.default(0.6, 0.6, "text at (0.6, 0.6)"): plot.new has not been called yet
Traceback:

1. text(0.6, 0.6, "text at (0.6, 0.6)")
2. text.default(0.6, 0.6, "text at (0.6, 0.6)")

plot( x, y, 
     main="Main title",
     sub="subtitle",
     xlab="x-label",
     ylab="y-label",
    )

text(0.6, 0.6,
    "text at (0.6, 0.6)")

abline(h=.6, v=.6, lty=2)

Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...): plot.new has not been called yet
Traceback:

1. abline(h = 0.6, v = 0.6, lty = 2)
2. int_abline(a = a, b = b, h = h, v = v, untf = untf, ...)

plot( x, y, 
     main="Main title",
     sub="subtitle",
     xlab="x-label",
     ylab="y-label",
    )

text(0.6, 0.6,
    "text at (0.6, 0.6)")

abline(h=.6, v=.6, lty=2)

# -> On each side, write the numbers -1 to 4 at coordinate 0.7 along the axis, on margin lines -1 to 4
for (side in 1:4) 
    mtext(-1:4, side=side, at=.7, line=-1:4)

Error in mtext(-1:4, side = side, at = 0.7, line = -1:4): plot.new has not been called yet
Traceback:

1. mtext(-1:4, side = side, at = 0.7, line = -1:4)

plot( x, y, 
     main="Main title",
     sub="subtitle",
     xlab="x-label",
     ylab="y-label",
    )

text(0.6, 0.6,
    "text at (0.6, 0.6)")

abline(h=.6, v=.6, lty=2)

for (side in 1:4) 
    # on each side, write the numbers -1 to 4 at coordinate 0.7 along the axis, on margin lines -1 to 4
    mtext(-1:4, side=side, at=.7, line=-1:4)

c(1:4)

1:4

mtext(paste("side", 1:4), side=c(1:4), line=-1, font=2)

Error in mtext(paste("side", 1:4), side = c(1:4), line = -1, font = 2): plot.new has not been called yet
Traceback:

1. mtext(paste("side", 1:4), side = c(1:4), line = -1, font = 2)

plot( x, y, 
     main="Main title",
     sub="subtitle",
     xlab="x-label",
     ylab="y-label",
    )

text(0.6, 0.6,
    "text at (0.6, 0.6)")

abline(h=.6, v=.6, lty=2)

for (side in 1:4) 
    mtext(-1:4, side=side, at=.7, line=-1:4)

mtext(paste("side", 1:4), side=c(1:4), line=-1, font=2)

2.2.2 Building a plot from pieces

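Labeling each axis

plot(x, y)

# type="n" -> set up the plot region without drawing any points (an empty canvas)

# xlab="", ylab="" -> suppress the axis labels

# axes=F -> do not draw the axes either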

plot(x, y, 
     type="n", 
     xlab="", ylab="",
     axes=F
     )

plot(x, y, 
     type="n", xlab="", ylab="", axes=F)

points(x, y)

# 2) axis(1) -> draw only axis 1 (the x-axis)
plot(x, y, 
     type="n", xlab="", ylab="", axes=F)

points(x, y)
axis(1)

# 2) axis(1) -> draw only axis 1 (the x-axis)
# 3) axis(2, at= ) -> on axis 2 (the left y-axis), place the tick marks at the given positions


plot(x, y, 
     type="n", xlab="", ylab="", axes=F)

points(x, y)
axis(1)
axis(2, at=seq(0.2, 1.8, 0.2))

# 2) axis(1) -> draw only axis 1 (the x-axis)
# 3) axis(2, at= ) -> on axis 2 (the left y-axis), place the tick marks at the given positions
# 4) box() -> draw the rectangular frame around all four sides

plot(x, y, 
     type="n", xlab="", ylab="", axes=F)

points(x, y)
axis(1)
axis(2, at=seq(0.2, 1.8, 0.2))
box()

# 2) axis(1) -> draw only axis 1 (the x-axis)
# 3) axis(2, at= ) -> on axis 2 (the left y-axis), place the tick marks at the given positions
# 4) box() -> draw the rectangular frame around all four sides
# 5) title() -> main (top), sub (bottom), and xlab, ylab for the two axis labels

plot(x, y, 
     type="n", xlab="", ylab="", axes=F)

points(x, y)
axis(1)
axis(2, at=seq(0.2, 1.8, 0.2))
box()
title(main="Main title", sub = "subtitle",
      xlab="x-label", ylab="y-label")

2.2.3 Using par

par() sets graphical parameters that are useful when drawing plots

?par

par(mfrow=c(2,2)) # split the device into a 2 x 2 grid of plots

par(mfrow=c(1,1))
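
par() returns the previous values of the parameters it changes, so the old layout can be restored without hard-coding c(1,1). A minimal sketch; the four plots are arbitrary choices for filling the 2 x 2 grid, and xs is made-up data.

```r
old <- par(mfrow = c(2, 2))   # par() returns the previous settings invisibly
xs <- rnorm(50)               # made-up data to fill the four panels
hist(xs)
boxplot(xs)
plot(xs)
qqnorm(xs)
par(old)                      # restore the earlier layout
```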

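2.2.4 Combining plots

To combine (overlay) plots, pass add=T to the second plot
- Normal random values -> rnorm(n)
- Normal density function -> dnorm(x)
Draw a histogram, then overlay the normal density to check whether the data look normally distributed

# - Without add=T, the two plots are drawn separately.


x <- rnorm(100) # 100 random values from the standard normal distribution

hist(x, freq=F) # histogram: with freq=F the y-axis shows density instead of counts
curve(dnorm(x), add=T) # draw the normal density dnorm(x) with curve(), overlaid via add=T

# note) normal random values -> rnorm(n)
# note) normal density function -> dnorm(x)

# Draw the histogram again, then add the second plot with add=T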


h <- hist(x, plot=F)
h

$breaks
[1] -3 -2 -1  0  1  2  3  4

$counts
[1]  4  9 39 28 17  2  1

$density
[1] 0.04 0.09 0.39 0.28 0.17 0.02 0.01

$mids
[1] -2.5 -1.5 -0.5  0.5  1.5  2.5  3.5

$xname
[1] "x"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"

h <- hist(x, plot=F)

# Take the density (y-axis) values from the hist object and compute a y-range that also covers the peak of the normal curve.
ylim <- range(0, h$density, dnorm(0))

# Draw a new histogram with the adjusted ylim.
hist(x, freq=F, ylim=ylim)

curve(dnorm(x), add=T)

2.3 R programming

Define a function -> save it to a .r file -> load it with source() -> use it

# -> The function is written as a string so it can be saved to a file
func_string <- "hist.with.normal <- function(x) 
{
    h <- hist(x, plot=F)
    s <- sd(x)
    m <- mean(x)
    ylim <- range(0, h$density, dnorm(0, sd=s)) 
    
    hist(x, freq=F, ylim=ylim)
    curve(dnorm(x, m, s), add=T)
}"

filename <- "./01_p44.r"

fileConn <- file(filename)

writeLines(func_string, fileConn)
close(fileConn)

source(filename)

ls()

x <- rnorm(100)

hist.with.normal(x)

2.3.1 Flow control

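Three of R's looping constructs
1. while -> takes the condition under which the loop continues
2. repeat -> loops until an explicit break; the exit condition is the opposite of while's
3. for -> iterates over a vector; used below to draw curves

y <- 12345

# 1. Square root computed with while (condition) expression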
x <- y/2
while (abs(x*x-y) > 1e-10 ) x <- (x + y/x)/2

x

x^2

# 1. Square root computed with while (condition) expression
# 2. repeat { expression; if (exit condition) break }

repeat{
    x <- (x + y/x)/2
    if (abs(x*x-y) < 1e-10) break
}

x

x <- seq(0, 1, 0.05) # a step of 0.05 gives enough points for smooth curves

plot(x, x, ylab="y", type="l")

for ( j in 2:8 ) lines(x, x^j)
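
The while loop above is Newton's iteration for a square root, and it can be wrapped in a function. A sketch; `newton_sqrt` is a hypothetical name, and the max_iter guard is an addition that is not in the notes.

```r
# Hypothetical helper wrapping the while loop above.
newton_sqrt <- function(y, tol = 1e-10, max_iter = 100) {
  x <- y / 2                      # starting guess, as in the notes
  iter <- 0
  while (abs(x * x - y) > tol && iter < max_iter) {
    x <- (x + y / x) / 2          # Newton update
    iter <- iter + 1
  }
  x
}

newton_sqrt(12345)
sqrt(12345)   # built-in result for comparison
```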

2.4 Data entry

Kim Sung-soo's 2015 lecture notes (saved page)
The code can be found by searching "01 2015자료 데이터시각화.pdf"

text file: read.table, read.csv
excel file: read.xlsx (package: xlsx)

install.packages("xlsx")
library(xlsx)
drug.data = read.xlsx("c:/Rfolder/data/drug.xlsx", 1) # the second argument, 1, is the sheet index
edit(drug.data) # open the data in R's built-in data editor to modify it

Recoding variable values, Method 1: indexing

Mapping ranges of values to new values is called recoding.

Create a new variable (copy by assignment)
Use indexing to convert the values range by range
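
The two steps above can be sketched as follows. The age values and the cut points 40 and 60 are made up for illustration; the conditions always test the original variable so that already-recoded values are never re-tested.

```r
age <- c(23, 40, 41, 59, 60, 75)   # made-up values

age_group <- age                    # step 1: copy into a new variable
age_group[age <= 40] <- 1           # step 2: recode range by range
age_group[age > 40 & age <= 60] <- 2
age_group[age > 60] <- 3
age_group   # 1 1 2 2 2 3
```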

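Recoding variable values, Method 2: recode() from the car package

Creating the new variable first is the same
recode() lets you express the ranges compactly
- " lo:40=1; 40:60=2;60:hi=3 "
For a categorical variable whose codes 1, 2, 3 are ordered, use ordered() to attach factor labels in that order.
- Once labels are set, they also appear in plots.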

Value labels: converting numbers to labels

Nominal variable -> convert with factor(), an unordered category.
Ordinal variable -> convert with ordered(), a category with an order.
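
The practical difference between factor() and ordered() shows up in comparisons. A sketch on made-up codes; the variable names and English labels are assumptions for illustration.

```r
religion <- c(1, 3, 3, 2, 1)   # nominal codes (made up)
edu      <- c(3, 5, 1, 4, 2)   # ordinal codes (made up)

religion_f <- factor(religion, levels = 1:3, labels = c("A", "B", "C"))
edu_o <- ordered(edu, levels = 1:5,
                 labels = c("none", "elementary", "middle", "high", "college"))

is.ordered(religion_f)          # FALSE
is.ordered(edu_o)               # TRUE
at_least_high <- edu_o >= "high"   # comparisons are meaningful only for ordered factors
at_least_high
```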

Output: wd (1.2.5 Missing Value)
M1	M2	M3	M4	M5	Y
0.573	0.1059	0.465	0.538	0.841	0.534
0.651	0.1356	0.527	0.545	0.887	0.535
0.606	0.1273	0.494	0.522	0.920	0.570
0.437	0.1591	0.446	0.423	0.992	0.450
0.547	0.1135	0.531	0.519	0.915	0.548
0.444	0.1628	0.429	0.411	0.984	0.431
0.489	0.1231	0.562	0.455	0.824	0.401
0.536	0.1473	0.410	0.430	0.978	0.423
0.413	0.1182	0.592	0.464	0.854	0.475
0.685	0.1564	0.631	0.564	0.914	0.486
0.664	0.1588	0.506	0.481	0.867	0.554
0.703	0.1335	0.519	0.484	0.812	0.519
0.653	0.1395	0.625	0.519	0.892	0.492
0.586	0.1114	0.505	0.565	0.889	0.517
0.534	0.1143	0.521	0.571	0.889	0.502
0.523	0.1320	0.508	0.412	0.919	0.508
0.580	0.1249	0.546	0.608	0.954	0.520
0.448	0.1028	0.522	0.534	0.918	0.506
0.417	0.1684	0.405	0.415	0.981	0.401
0.528	0.1057	0.424	0.566	0.909	0.568

Output: head(nwd) after nwd[ nwd > 0.9 ] = 99
M1	M2	M3	M4	M5	Y
0.573	0.1059	0.465	0.538	0.841	0.534
0.651	0.1356	0.527	0.545	0.887	0.535
0.606	0.1273	0.494	0.522	99.000	0.570
0.437	0.1591	0.446	0.423	99.000	0.450
0.547	0.1135	0.531	0.519	99.000	0.548
0.444	0.1628	0.429	0.411	99.000	0.431

Output: head(nwd) after nwd[nwd == 99] = NA
M1	M2	M3	M4	M5	Y
0.573	0.1059	0.465	0.538	0.841	0.534
0.651	0.1356	0.527	0.545	0.887	0.535
0.606	0.1273	0.494	0.522	NA	0.570
0.437	0.1591	0.446	0.423	NA	0.450
0.547	0.1135	0.531	0.519	NA	0.548
0.444	0.1628	0.429	0.411	NA	0.431

Output: na.omit(nwd)
	M1	M2	M3	M4	M5	Y
1	0.573	0.1059	0.465	0.538	0.841	0.534
2	0.651	0.1356	0.527	0.545	0.887	0.535
7	0.489	0.1231	0.562	0.455	0.824	0.401
9	0.413	0.1182	0.592	0.464	0.854	0.475
11	0.664	0.1588	0.506	0.481	0.867	0.554
12	0.703	0.1335	0.519	0.484	0.812	0.519
13	0.653	0.1395	0.625	0.519	0.892	0.492
14	0.586	0.1114	0.505	0.565	0.889	0.517
15	0.534	0.1143	0.521	0.571	0.889	0.502

Output: is.na(nwd)
M1	M2	M3	M4	M5	Y
FALSE	FALSE	FALSE	FALSE	FALSE	FALSE
FALSE	FALSE	FALSE	FALSE	FALSE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	FALSE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	FALSE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	FALSE	FALSE
FALSE	FALSE	FALSE	FALSE	FALSE	FALSE
FALSE	FALSE	FALSE	FALSE	FALSE	FALSE
FALSE	FALSE	FALSE	FALSE	FALSE	FALSE
FALSE	FALSE	FALSE	FALSE	FALSE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE
FALSE	FALSE	FALSE	FALSE	TRUE	FALSE

Output: head(insurance)
id	sex	job	religion	edu	amount	salary
1	m	1	1	3	7.0	110
2	m	2	1	4	12.0	135
3	f	2	3	5	8.5	127
4	f	3	3	5	5.0	150
5	m	1	3	3	4.5	113
6	m	2	1	2	3.5	95

Output: head(insurance) after converting job to a factor
id	sex	job	religion	edu	amount	salary
1	m	근로자	1	3	7.0	110
2	m	사무직	1	4	12.0	135
3	f	사무직	3	5	8.5	127
4	f	전문가	3	5	5.0	150
5	m	근로자	3	3	4.5	113
6	m	사무직	1	2	3.5	95

Output: insurance with the added edu2 column
id	sex	job	religion	edu	amount	salary	edu2
1	m	근로자	1	3	7.0	110	중졸
2	m	사무직	1	4	12.0	135	고졸
3	f	사무직	3	5	8.5	127	대졸
4	f	전문가	3	5	5.0	150	대졸
5	m	근로자	3	3	4.5	113	중졸
6	m	사무직	1	2	3.5	95	국졸
7	m	전문가	2	4	4.0	102	고졸
8	f	전문가	2	4	4.0	122	고졸
9	f	사무직	3	4	4.5	140	고졸
10	m	근로자	3	5	17.0	100	대졸
11	f	근로자	1	3	22.0	NA	중졸
12	m	사무직	1	2	5.5	106	국졸
13	m	전문가	2	1	4.5	130	무학
14	m	전문가	2	5	7.0	150	대졸
15	m	NA	3	4	6.0	110	고졸
16	f	근로자	3	NA	7.0	88	NA
17	m	근로자	1	4	6.0	138	고졸
18	f	사무직	1	5	5.0	110	대졸
19	m	사무직	3	3	7.0	85	중졸
20	m	전문가	3	4	9.5	110	고졸
21	m	전문가	1	4	10.0	95	고졸
22	m	전문가	2	3	12.0	88	중졸

Output: energy
expend	stature
9.21	obese
7.53	lean
7.48	lean
8.08	lean
8.09	lean
10.15	lean
8.40	lean
10.88	lean
6.13	lean
7.90	lean
11.51	obese
12.79	obese
7.05	lean
11.85	obese
9.97	obese
7.48	lean
8.79	obese
9.69	obese
9.68	obese
7.58	lean
9.19	obese
8.11	lean

Output: m, from matrix(rnorm(12), 4)
0.9182174	-0.7425004	1.0621464
-1.0513366	0.8554100	-0.2472357
-0.9116358	-1.6340607	-0.4725558
-0.1975745	-0.1045159	1.0616053

Output: intake
pre	post
5260	3910
5470	4220
5640	3885
6180	5160
6390	5645
6515	4680
6805	5265
7515	5975
7515	6790
8230	6900
8770	7335

Output: head(thue3, 4)
blood.glucose	short.velocity	log.gluc
15.3	1.76	2.727853
10.8	1.34	2.379546
8.1	1.27	2.091864
19.5	1.47	2.970414

Output: head(thue4)
blood.glucose	short.velocity	centered.log.gluc	log.gluc
15.3	1.76	0.4818798	2.727853
10.8	1.34	0.1335731	2.379546
8.1	1.27	-0.1541090	2.091864
19.5	1.47	0.7244414	2.970414
7.2	1.27	-0.2718920	1.974081
5.3	1.49	-0.5782662	1.667707

Output: thue2
	blood.glucose	short.velocity
6	5.3	1.49
11	6.7	1.25
12	5.2	1.19
15	6.7	1.52
17	4.2	1.12
22	4.9	1.03
