r - Breaking up a character string into multiple character strings on different lines


I have a data frame that contains a long character string, each associated with a 'sample':

sample  data
1       000000000000000000000000000n01000000000000n0n000000000n00n0000nn00n0n000000100000n00n0n0000000nnnn011111111111111111111111111111110000000000000000000n000000n0000000000n
2       000000000000000000000000000n01000000000000n0n000000000n00n0000nn00n0n000000100000n00n0n0000000nnnn011111111111111111111111111111110000000000000000000n000000n0000000000n

I would like to code an easy way to break this string up into 5 pieces, in the following format:

sample x
cct6 - characters 1-33
gat1 - characters 34-68
imd3 - characters 69-99
pdr3 - characters 100-130
rim15 - characters 131-168

giving an output that looks like this for each sample:

sample 1
cct6 - 000000000000000000000000000n01000
gat1 - 000000000n0n000000000n00n0000nn00n0
imd3 - n000000100000n00n0n0000000nnnn0
pdr3 - 1111111111111111111111111111111
rim15 - 0000000000000000000n000000n0000000000n

I've been able to use the substr function to break the long string into the individual pieces, but I'd like to be able to automate it so that I can get all 5 pieces in one output. Ideally the output would also be a data frame.

This is what ?read.fwf is for.

First, some data that looks like yours from the question:

x <- data.frame(sample = c(1, 2),
                data = c("000000000000000000000000000n01000000000000n0n000000000n00n0000nn00n0n000000100000n00n0n0000000nnnn011111111111111111111111111111110000000000000000000n000000n0000000000n",
                         "000000000000000000000000000n01000000000000n0n000000000n00n0000nn00n0n000000100000n00n0n0000000nnnn011111111111111111111111111111110000000000000000000n000000n0000000000n"),
                stringsAsFactors = FALSE)

Now use read.fwf, and specify the widths of each field and their names, and that they should be of mode character. We wrap the text column of the example data in textConnection so that we can treat it as a connection understood by read.* and other functions.

(strs <- read.fwf(textConnection(x$data),
                  widths = c(33, 35, 31, 31, 38),
                  colClasses = "character",
                  col.names = c("cct6", "gat1", "imd3", "pdr3", "rim15")))

                               cct6                                gat1                            imd3                            pdr3                                  rim15
1 000000000000000000000000000n01000 000000000n0n000000000n00n0000nn00n0 n000000100000n00n0n0000000nnnn0 1111111111111111111111111111111 0000000000000000000n000000n0000000000n
2 000000000000000000000000000n01000 000000000n0n000000000n00n0000nn00n0 n000000100000n00n0n0000000nnnn0 1111111111111111111111111111111 0000000000000000000n000000n0000000000n
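The widths passed to read.fwf follow directly from the character ranges listed in the question. As a small sketch (the variable names ends and widths are illustrative and not part of the original answer), they could be derived from the end positions rather than counted by hand:

```r
# End position of each field, taken from the ranges in the question
# (1-33, 34-68, 69-99, 100-130, 131-168)
ends <- c(33, 68, 99, 130, 168)

# Each width is the distance from the previous end position,
# with 0 standing in for the position before the first field
widths <- diff(c(0, ends))
widths
# [1] 33 35 31 31 38
```

The result could then be supplied as the widths argument in the read.fwf call.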

Now loop over the rows and print out each one as per your example:

for (i in seq_len(nrow(strs))) {
  # Header line for this sample, then one "name - value" line per field
  writeLines(paste("sample", i))
  writeLines(paste(names(strs), strs[i, ], sep = " - "))
}

giving, for example:

sample 2
cct6 - 000000000000000000000000000n01000
gat1 - 000000000n0n000000000n00n0000nn00n0
imd3 - n000000100000n00n0n0000000nnnn0
pdr3 - 1111111111111111111111111111111
rim15 - 0000000000000000000n000000n0000000000n
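Since the question mentions substr and asks for a data frame as the output, the same split could also be sketched with substring(), which is vectorised over its start and stop arguments. The example below uses hypothetical toy data (two 9-character strings) and a hypothetical layout of three fields named f1, f2 and f3, so that the result is easy to check by eye. It is an alternative sketch, not the method from the answer above.

```r
# Hypothetical toy data: two samples, each a 9-character string
x <- data.frame(sample = c(1, 2),
                data = c("AAABBCCCC", "DDDEEFFFF"),
                stringsAsFactors = FALSE)

# Hypothetical field layout: three named fields of widths 3, 2 and 4
widths <- c(f1 = 3, f2 = 2, f3 = 4)
ends   <- cumsum(widths)        # last character of each field
starts <- ends - widths + 1     # first character of each field

# One substring() call per field; each call splits every row at once
pieces <- Map(function(s, e) substring(x$data, s, e), starts, ends)

# Bind the sample ids to the pieces to get a single data frame
result <- data.frame(sample = x$sample, pieces, stringsAsFactors = FALSE)
result
```

For the data in the question, the layout would instead be the five named widths 33, 35, 31, 31 and 38. The sample column can be attached to the read.fwf result in the same way, for instance with cbind(sample = x$sample, strs).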
