Data processing using R
I have a .ped file that contains several columns, and I want to extract some information from it. Here is a sample of the data (there is no header):
    1 1 1
    1 2 1
    2 3 2
    3 4 1
    3 5 2
    ...
The first column is the family ID, the second is the individual ID, and the third is the sex of the individual.
I read the table into a data frame:
    ped <- read.table("pedigree.ped", header=FALSE)
How can I compute the number of families that exist (one family can appear more than once, and I want to count it only once)? Also, the sex column uses 1 for male and 2 for female; how can I get the distribution of males and females in the data set?
I'm new to R, so any code would be appreciated!
Thanks in advance.
Since you are new to R, I suggest looking at Excel first. The operations you are asking about are simple and can be done in Excel.
If you want to use R, look at data.frame indexing, subsetting, etc.
If you are familiar with SQL, look at the sqldf package.
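As a minimal sketch of data.frame indexing and subsetting, assuming the five sample rows from the question stand in for the real file. The column names family, id and sex are labels chosen here for illustration; without them read.table names the columns V1, V2 and V3.

```r
# Sample rows from the question, inlined so the sketch runs without the file.
ped <- read.table(text = "1 1 1
1 2 1
2 3 2
3 4 1
3 5 2", header = FALSE)

# Hypothetical column names; the .ped file itself has no header.
names(ped) <- c("family", "id", "sex")

first_col <- ped[, 1]                 # select a column by position
same_col  <- ped$family               # select the same column by name
males     <- ped[ped$sex == 1, ]      # keep rows where sex is coded 1 (male)
family3   <- subset(ped, family == 3) # keep rows belonging to family 3
```

Logical indexing (ped[condition, ]) and subset() return the same kind of result; subset() is often easier to read interactively, while bracket indexing is the more common choice inside functions.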
Number of families:
    numfamilies <- length(unique(ped[,1]))
Number of males and females:
    nummales <- sum(ped[,3] == 1)
    numfemales <- sum(ped[,3] == 2)
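An alternative sketch for the sex distribution uses table(), which counts every category in one call. It assumes the five sample rows from the question in place of the real file, and the names num_families, sex_counts and sex_props are illustrative only.

```r
# Sample rows from the question, inlined so the sketch runs without the file.
ped <- read.table(text = "1 1 1
1 2 1
2 3 2
3 4 1
3 5 2", header = FALSE)

# Distinct family IDs in the first column.
num_families <- length(unique(ped[, 1]))

# Label the sex codes (1 = male, 2 = female) and count each category.
sex <- factor(ped[, 3], levels = c(1, 2), labels = c("male", "female"))
sex_counts <- table(sex)

# Convert the counts to proportions of the whole data set.
sex_props <- prop.table(sex_counts)
```

Declaring the factor levels explicitly means a category with no rows still appears with a count of zero, and any code other than 1 or 2 becomes NA, which table() leaves out by default.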