Mid-week performance updates

As a small mid-week update, I wanted to post my latest profiling data. Peter raised the issue of redundant file-read passes, and we discussed various ways to fix it. I have now pushed my updated lazy parser code; the profiling results below come from a test identical to the one posted last week. Previously tell() was called 14 million times and readline() 9.5 million times. Now tell() calls are down by over 90% to only 1.25 million, and readline() calls have been roughly halved to just under 5 million.
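For anyone wondering how a tell() count can drop that far, here is a minimal sketch of one general approach: read the file in a single pass and track offsets by adding up line lengths rather than asking the handle for its position after every line. This is not the code I pushed; the function name index_line_offsets and the sample data are made up for illustration, and it assumes the handle is opened in binary mode so that line lengths are byte counts.

```python
import io


def index_line_offsets(handle):
    """Return the byte offset of each line start, found in a single pass.

    Hypothetical helper: assumes `handle` is in binary mode, so len(line)
    is a byte count and offsets can be passed straight to seek().
    """
    offsets = []
    position = handle.tell()  # a single tell() call for the starting point
    for line in iter(handle.readline, b""):
        offsets.append(position)
        position += len(line)  # advance by arithmetic instead of tell()
    return offsets


# Small in-memory stand-in for a sequence file (made-up data).
sample = io.BytesIO(b"LOCUS       TEST\nFEATURES\nORIGIN\n//\n")
offsets = index_line_offsets(sample)

# Jump straight back to the third line using the recorded offset.
sample.seek(offsets[2])
third_line = sample.readline()
```

The point of the sketch is only that one readline() per line and one tell() per file are enough to build an index; a second pass over the file, or a tell() after each line, adds calls without adding information.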

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   19.161   19.161   74.577   74.577 InsdcIO.py:1279(_make_record_in...
  4969941   18.411    0.000   18.411    0.000 {method 'readline' of 'file' ob...
  4559813   11.647    0.000   14.985    0.000 re.py:226(_compile)
  4559812    9.643    0.000   33.604    0.000 re.py:134(match)
  4596151    9.304    0.000    9.304    0.000 {method 'match' of '_sre.SRE_Pa...
  1251386    5.360    0.000    5.360    0.000 {method 'tell' of 'file' objects}
    20320    4.647    0.000   12.717    0.001 __init__.py:953(location)
  4559819    3.335    0.000    3.335    0.000 {method 'get' of 'dict' objects}
        1    2.652    2.652   29.659   29.659 InsdcIO.py:1456(_make_feature_i...
   195641    2.148    0.000    4.221    0.000 SeqFeature.py:583(__init__)
    20320    2.137    0.000    3.149    0.000 Scanner.py:211(parse_feature)
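A listing in this format can be produced with the standard library's cProfile and pstats modules. The sketch below is only an illustration of that workflow, not my actual test script: parse_records is a hypothetical stand-in for the parser being profiled, and the input is a tiny in-memory string.

```python
import cProfile
import io
import pstats


def parse_records(handle):
    """Hypothetical stand-in for the parser under test."""
    return [line.split() for line in handle]


# Profile only the call we care about.
profiler = cProfile.Profile()
profiler.enable()
records = parse_records(io.StringIO("a b\nc d\n"))
profiler.disable()

# Sort by "tottime" (reported as "internal time") and keep the top 10 rows.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("tottime").print_stats(10)
report_text = report.getvalue()
print(report_text)
```

Sorting by internal time puts the functions that burn the most time in their own bodies at the top, which is why readline() and tell() stand out so clearly in the table above.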
