R: subset dataframe based on column entry in multiple rows
I have a data frame that contains information about several genes:

    chr start end gene area
    1   100   110 Bit  Exon
    1   120   130 Bit  Intron
    1   500   550 Ball Upstream, Downstream
    1   590   600 Ball Intron, Upstream
    1   900   980 Dead Promoter, Upstream
I would like to subset the data based on the area column, removing every row that belongs to a gene with "Exon" or "Promoter" in any of its rows. I was using:

    field <- subset(table, area == "Intron" | area == "Downstream" | area == "Upstream" | area == "Downstream, Upstream")
However, this gives me:

    chr start end gene area
    1   120   130 Bit  Intron
    1   500   550 Ball Upstream, Downstream
    1   590   600 Ball Intron, Upstream
What I need is:

    chr start end gene area
    1   500   550 Ball Upstream, Downstream
    1   590   600 Ball Intron, Upstream

Try grepl:

    df[!grepl("Exon|Promoter", df$area), ]

It is not clear to me why you are also removing row 2 with "Intron"; please explain.
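For reference, here is a minimal sketch of what the row-wise grepl filter returns on the sample data. The data frame is rebuilt by hand from the table in the question, so the column names (gene, area), the capitalisation of the values and the end value 980 are assumptions about the original data. Note that grepl is case-sensitive by default, so the pattern must match the case used in the column.

```r
# Sample data rebuilt by hand from the question's table (assumed values)
df <- data.frame(
  chr   = c(1, 1, 1, 1, 1),
  start = c(100, 120, 500, 590, 900),
  end   = c(110, 130, 550, 600, 980),
  gene  = c("Bit", "Bit", "Ball", "Ball", "Dead"),
  area  = c("Exon", "Intron", "Upstream, Downstream",
            "Intron, Upstream", "Promoter, Upstream"),
  stringsAsFactors = FALSE
)

# Row-wise filter: drops only the rows whose own area matches the pattern
row_filtered <- df[!grepl("Exon|Promoter", df$area), ]
print(row_filtered)
```

The "Bit" row with "Intron" survives this filter, because only its sibling row contains "Exon".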
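Edit:

I think I get it now. Try this instead:

    # genes that have "Exon" or "Promoter" in at least one row
    temp <- df$gene[grepl("Exon|Promoter", df$area)]
    # keep only the rows of genes that were never flagged
    df[!df$gene %in% temp, ]
    #   chr start end gene                 area
    # 3   1   500 550 Ball Upstream, Downstream
    # 4   1   590 600 Ball     Intron, Upstream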
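If the same filter is needed more than once, the two steps above could be wrapped in a small function. This is only a sketch: drop_genes_matching and its arguments are hypothetical names introduced here, and the sample data frame is rebuilt by hand from the question's table (column names and values assumed).

```r
# Hypothetical helper: remove every row of any gene that has at least
# one row whose area matches the pattern
drop_genes_matching <- function(data, pattern,
                                gene_col = "gene", area_col = "area") {
  # genes flagged by a match in any of their rows
  flagged <- unique(data[[gene_col]][grepl(pattern, data[[area_col]])])
  # keep rows whose gene was never flagged
  data[!data[[gene_col]] %in% flagged, , drop = FALSE]
}

# Sample data rebuilt by hand from the question's table (assumed values)
df <- data.frame(
  chr   = c(1, 1, 1, 1, 1),
  start = c(100, 120, 500, 590, 900),
  end   = c(110, 130, 550, 600, 980),
  gene  = c("Bit", "Bit", "Ball", "Ball", "Dead"),
  area  = c("Exon", "Intron", "Upstream, Downstream",
            "Intron, Upstream", "Promoter, Upstream"),
  stringsAsFactors = FALSE
)

result <- drop_genes_matching(df, "Exon|Promoter")
print(result)
```

Only the two "Ball" rows remain, because "Bit" is flagged by its "Exon" row and "Dead" by its "Promoter, Upstream" row.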