python - pandas reindex with data frame -

- April 15, 2013

I have a data frame with a multi index with three levels, for example:

                   COL1  COL2  ...
  CHROM POS LABEL
  chr1  43  stra   ...   ...   ...
            strb   ...   ...   ...
        66  strc   ...   ...   ...
            strb   ...   ...   ...
  chr2  29  strD   ...   ...   ...
  ...   ... ...    ...   ...   ...

and a Series whose multi index consists of the first two levels of the DataFrame's index:

             val
  CHROM POS
  chr1  43   v1
        66   v2
  chr2  29   v3
  ...   ...  ...

I would like to add a column to the DataFrame from the Series, with the values v1, v2, ... repeated for every row whose index matches on the first two levels, such as:

                   COL1  COL2  new  ...
  CHROM POS LABEL
  chr1  43  stra   ...   ...   v1   ...
            strb   ...   ...   v1   ...
        66  strc   ...   ...   v2   ...
            strb   ...   ...   v2   ...
  chr2  29  strD   ...   ...   v3   ...
  ...   ... ...    ...   ...   ...  ...

Note that there are no missing rows in the Series, that is, every (CHROM, POS) pair in the DataFrame is also present in the Series. I have a working solution:

  pandas.Series(variant_db.index.map(lambda i: cov_per_sample[sample].loc[i[:2]]), index=variant_db.index)

However, because of the lambda, it is very slow for large data (hundreds of rows). I tried the following, which is very fast:

  df['New'] = s.reindex(df.index, method='ffill')

But this way df['New'] contains many NaN values, which should not be there. Using method='bfill', I get NaN at different positions, but some rows get NaN in both cases, so combining the two does not work either.

I would like to do this using only library functions, for efficiency. Can anyone help?

You can take these very simple steps, which should also work on your large data:


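  import pandas

  df1 = pandas.DataFrame([
      {'CHROM': 'chr1', 'POS': 43, 'LABEL': 'stra'},
      {'CHROM': 'chr1', 'POS': 43, 'LABEL': 'strb'},
      {'CHROM': 'chr1', 'POS': 66, 'LABEL': 'strc'},
      {'CHROM': 'chr1', 'POS': 66, 'LABEL': 'strb'},
      {'CHROM': 'chr2', 'POS': 29, 'LABEL': 'strD'},
  ])
  df2 = pandas.DataFrame([
      {'CHROM': 'chr1', 'POS': 43, 'VAL': 'v1'},
      {'CHROM': 'chr1', 'POS': 66, 'VAL': 'v2'},
      {'CHROM': 'chr2', 'POS': 29, 'VAL': 'v3'},
  ])

  # For each (CHROM, POS) row of df2, write its VAL into the matching rows of df1.
  # .loc is used here because the .ix indexer has been removed from pandas.
  for i, r in df2.iterrows():
      df1.loc[(df1['CHROM'] == r['CHROM']) & (df1['POS'] == r['POS']), 'New'] = r['VAL']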
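The loop above still visits every row of df2 in Python. As a possible alternative that stays inside library functions, here is a minimal sketch using a left merge on the two key columns. It assumes the same small example frames as above and that every (CHROM, POS) pair appears exactly once in df2; the name `merged` is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "CHROM": ["chr1", "chr1", "chr1", "chr1", "chr2"],
    "POS": [43, 43, 66, 66, 29],
    "LABEL": ["stra", "strb", "strc", "strb", "strD"],
})
df2 = pd.DataFrame({
    "CHROM": ["chr1", "chr1", "chr2"],
    "POS": [43, 66, 29],
    "VAL": ["v1", "v2", "v3"],
})

# A left merge keeps every row of df1, in its original order,
# and attaches the VAL of the matching (CHROM, POS) row of df2.
merged = df1.merge(df2, on=["CHROM", "POS"], how="left")
merged = merged.rename(columns={"VAL": "New"})
```

If a (CHROM, POS) pair were missing from df2, the corresponding rows would get NaN in the new column, so the result is only gap-free under the "no missing rows" condition stated in the question.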

or, using the index:

  import pandas

  df1 = pandas.DataFrame([
      {'CHROM': 'chr1', 'POS': 43, 'LABEL': 'stra', 'COL': ''},
      {'CHROM': 'chr1', 'POS': 43, 'LABEL': 'strb', 'COL': ''},
      {'CHROM': 'chr1', 'POS': 66, 'LABEL': 'strc', 'COL': ''},
      {'CHROM': 'chr1', 'POS': 66, 'LABEL': 'strb', 'COL': ''},
      {'CHROM': 'chr2', 'POS': 29, 'LABEL': 'strD', 'COL': ''},
  ]).set_index(['CHROM', 'POS', 'LABEL'])
  df2 = pandas.DataFrame([
      {'CHROM': 'chr1', 'POS': 43, 'VAL': 'v1'},
      {'CHROM': 'chr1', 'POS': 66, 'VAL': 'v2'},
      {'CHROM': 'chr2', 'POS': 29, 'VAL': 'v3'},
  ]).set_index(['CHROM', 'POS'])

  # i is the (CHROM, POS) index tuple of the current df2 row; it selects
  # all rows of df1 that share those first two index levels.
  # .loc is used here because the .ix indexer has been removed from pandas.
  for i, r in df2.iterrows():
      df1.loc[(i[0], i[1]), 'new'] = r['VAL']
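For the indexed layout described in the question, one way to avoid the Python-level loop might be to drop the third index level and reindex the Series by the resulting two-level keys. The sketch below assumes the Series index is unique and covers every (CHROM, POS) pair; COL1 and its values are hypothetical filler, and `keys` is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame(
    {"COL1": [0, 1, 2, 3, 4]},  # hypothetical filler column
    index=pd.MultiIndex.from_tuples(
        [("chr1", 43, "stra"), ("chr1", 43, "strb"),
         ("chr1", 66, "strc"), ("chr1", 66, "strb"),
         ("chr2", 29, "strD")],
        names=["CHROM", "POS", "LABEL"],
    ),
)
s = pd.Series(
    ["v1", "v2", "v3"],
    index=pd.MultiIndex.from_tuples(
        [("chr1", 43), ("chr1", 66), ("chr2", 29)],
        names=["CHROM", "POS"],
    ),
)

# Drop the LABEL level so the lookup keys have the same two levels as s.
keys = df.index.droplevel("LABEL")

# Reindex s by those (repeated) keys, then assign positionally:
# to_numpy() strips the two-level index so no alignment is attempted
# against df's three-level index.
df["new"] = s.reindex(keys).to_numpy()
```

Unlike `s.reindex(df.index, method='ffill')`, this looks each row up by an exact two-level key, so no fill method is involved.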
