- #1
michaelp7
- 6
- 0
I'm trying to analyze a fairly large (order 10^3) collection of PDB files to look for cation-pi interactions for a class. This requires me to parse each PDB file into a table which gives the position of each atom in the file. To do this, I've written the following code:
Timing[Open[AllFiles[[2]]]; (*AllFiles is a list of the filenames in the directory. I intend to replace the 2 with an iteration index when the code is working*)
Lines = Import[AllFiles[[2]], "Lines"];
FullTable = {};
Do[
LineStream = StringToStream[Lines[[j]]];
QATOM = Read[LineStream, Word];
If[QATOM == "ATOM", (*This condition looks for lines that actually describe atoms, instead of other information*)
ThisLine =
Read[LineStream, {Number, Word, Word, Word,
Number, {Number, Number, Number}}];
If[Or[StringLength[ThisLine[[3]]] == 3, StringTake[ThisLine[[3]], 1] == "A" (*This condition eliminates duplicate listings*)],
FullTable = Append[FullTable, ThisLine]]
]
, {j, 1, Length[Lines]}];
]
The code does what it's supposed to, but it slows down significantly each time I run it. The first run takes less than .2 seconds, but by the fifth run, it's already above 25 seconds to parse the same file. Quitting the kernel session solves the speed problem, but of course this deletes all my data. CleanSlate, ClearAll, and adjusting $HistoryLength all had no effect. I haven't come across a solution on this forum yet, so I would appreciate any suggestions.
Timing[Open[AllFiles[[2]]]; (*AllFiles is a list of the filenames in the directory. I intend to replace the 2 with an iteration index when the code is working*)
Lines = Import[AllFiles[[2]], "Lines"];
FullTable = {};
Do[
LineStream = StringToStream[Lines[[j]]];
QATOM = Read[LineStream, Word];
If[QATOM == "ATOM", (*This condition looks for lines that actually describe atoms, instead of other information*)
ThisLine =
Read[LineStream, {Number, Word, Word, Word,
Number, {Number, Number, Number}}];
If[Or[StringLength[ThisLine[[3]]] == 3, StringTake[ThisLine[[3]], 1] == "A" (*This condition eliminates duplicate listings*)],
FullTable = Append[FullTable, ThisLine]]
]
, {j, 1, Length[Lines]}];
]
The code does what it's supposed to, but it slows down significantly each time I run it. The first run takes less than .2 seconds, but by the fifth run, it's already above 25 seconds to parse the same file. Quitting the kernel session solves the speed problem, but of course this deletes all my data. CleanSlate, ClearAll, and adjusting $HistoryLength all had no effect. I haven't come across a solution on this forum yet, so I would appreciate any suggestions.