Hi,
I have a couple of Roassal questions about customizing plots of a genome metric known as GC skew, and about scaling the visualization to genome-scale (bacterial) data. Below is a Roassal sample I isolated from BioSmalltalk, showing how I produced an initial GC skew plot:
| b values ds |
values := 'GGCTGCGTTCCCCTCAGTTAGCGCCTATCCTAAGCAGATCTGTAGTTAGTACTGTCTAAGCTTGTTAGACTACTCGGAACTTGCTGATATTAACCTTACCCCCGTCGAAACGCTTATTCCGCTTTGCTACTTCAAGCCCTGTAACATCTACTGTACTGACAAGGTTGCAGTAGCAATTGGCAAGGCTGTTTGGCATCTCAGATGACAGTTACCCGTGTTGCGCTCACCCGCAGCGACTCTCGGATACGTAACGCAGAAGACGTCTTCCGCGAGATTTGGGCGCGTCTGTCCACCTTCCCAGGTTGGCATTGGCAGAAGCTCTATCCGGCTTTGTTCCTCTAGCGGCTCCGCA' asDNASimpleSequence gcSkewInt.
values := #(0 1 2 1 1 2 1 2 2 2 1 0 -1 -2 -2 -3 -3 -2 -2 -2 -2 -1 -2 -1 -2 -3 -3 -3 -3 -4 -5 -5 -5 -5 -4 -5 -5 -4 -4 -4 -5 -5 -4 -4 -4 -3 -3 -3 -3 -2 -2 -2 -3 -3 -2 -2 -3 -3 -3 -3 -2 -3 -3 -3 -2 -2 -2 -2 -1 -1 -2 -2 -2 -3 -3 -4 -3 -2 -2 -2 -3 -3 -3 -2 -3 -3 -2 -2 -2 -2 -2 -2 -2 -2 -3 -4 -4 -4 -4 -5 -6 -7 -8 -9 -8 -8 -9 -8 -8 -8 -8 -9 -8 -9 -9 -9 -9 -9 -9 -10 -11 -10 -11 -11 -11 -11 -10 -11 -11 -11 -12 -12 -12 -13 -13 -13 -12 -13 -14 -15 -15 -14 -14 -14 -14 -15 -15 -15 -16 -16 -16 -17 -17 -16 -16 -16 -17 -17 -16 -16 -17 -17 -17 -16 -15 -15 -15 -14 -15 -15 -14 -14 -14 -13 -14 -14 -14 -14 -14 -13 -12 -13 -13 -13 -12 -11 -12 -12 -11 -11 -11 -11 -10 -9 -10 -10 -10 -11 -11 -12 -12 -11 -11 -11 -10 -10 -11 -11 -10 -10 -10 -10 -11 -12 -13 -12 -12 -11 -11 -11 -10 -11 -10 -11 -11 -12 -12 -13 -14 -15 -14 -15 -15 -14 -15 -14 -14 -15 -15 -16 -16 -17 -16 -15 -15 -15 -15 -16 -15 -15 -15 -15 -16 -15 -16 -16 -15 -15 -15 -14 -14 -15 -14 -14 -15 -15 -15 -16 -17 -16 -17 -16 -16 -15 -15 -15 -15 -15 -14 -13 -12 -13 -12 -13 -12 -12 -13 -13 -12 -12 -13 -14 -14 -15 -16 -16 -16 -17 -18 -19 -19 -18 -17 -17 -17 -16 -15 -16 -16 -16 -16 -15 -14 -15 -15 -14 -14 -14 -13 -14 -14 -15 -15 -15 -15 -16 -17 -16 -15 -16 -16 -16 -16 -15 -15 -15 -16 -17 -17 -18 -18 -18 -17 -18 -17 -16 -17 -17 -18 -19 -18 -19 -19).
b := RTGrapher new.
b extent: 800 @ 500.
ds := RTData new
    noDot;
    points: values;
    connectColor: Color red;
    yourself.
b add: ds.
b axisY
    minValue: values min;
    title: 'Skew';
    color: Color black;
    noDecimal.
b axisX
    numberOfTicks: 10;
    noDecimal;
    color: Color black;
    title: 'Position'.
b open
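For anyone who does not have BioSmalltalk loaded: the literal array above is consistent with a running count of G minus C along the sequence, starting at 0. The snippet below is only a minimal plain-Pharo sketch under that assumption, not the actual gcSkewInt implementation; the variable names (sequence, running, skew) are just for illustration, and it uses only the first 17 bases of the sequence above.

```smalltalk
| sequence running skew |
"First 17 bases of the sample sequence used in the script above"
sequence := 'GGCTGCGTTCCCCTCAG'.
running := 0.
"Assumption: the skew series starts at 0, before any base is read"
skew := OrderedCollection with: 0.
sequence do: [ :base |
    "G raises the running count, C lowers it, A and T leave it unchanged"
    base = $G ifTrue: [ running := running + 1 ].
    base = $C ifTrue: [ running := running - 1 ].
    skew add: running ].
skew asArray
"#(0 1 2 1 1 2 1 2 2 2 1 0 -1 -2 -2 -3 -3 -2), the first 18 values of the array above"
```

This produces one value per base plus the initial 0, so a full genome yields as many points as it has bases.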
1) You can see the result in the attached file TR Morph.png. On the X axis, how can I set a tick at a fixed step, for example every 50 points? Right now the ticks fall at 88, 176, 264 and 353, and I would like them at 50, 100, 150, 200, 250, 300, 350 and 400.
2) So far I have only plotted a very short DNA sequence. However, plotting the GC skew for E. coli would take millions of points (the dataset file is 4639675 bytes). The following script takes ages to complete, or never finishes. The necessary files are attached:
| grapher ds eColiGCSkew zipArchive |
" The original dataset "
"(ZnEasy get: 'http://bioinformaticsalgorithms.com/data/realdatasets/Replication/E_coli.txt') contents asDNASimpleSequence."
"'/Users/mvs/Downloads/E_coli.txt' asFileReference size."
"4639675"

" GC Skew calc using BioSmalltalk "
"eColiGCSkew := '/Users/mvs/Downloads/E_coli.txt' asFileReference contents asDNASimpleSequence gcSkewInt."

" GC skew already calculated and stored in a zip-compressed Fuel file for this example "
zipArchive := ZipArchive new.
[ zipArchive
    readFrom: 'ecoligcskew.zip' fullName;
    extractAllTo: '.' ] ensure: [ zipArchive close ].
eColiGCSkew := FLMaterializer materializeFromFileNamed: 'OrderedCollection_3712516797.obj'.

grapher := RTGrapher new
    extent: 800 @ 500;
    yourself.
ds := RTData new
    noDot;
    points: eColiGCSkew;
    connectColor: Color red;
    yourself.
grapher add: ds.
grapher axisY
    minValue: eColiGCSkew min;
    title: 'Skew';
    color: Color black;
    noDecimal.
grapher axisX
    numberOfTicks: 10;
    noDecimal;
    color: Color black;
    title: 'Position'.
grapher open
The file skew_diagram_ecoli.png shows the expected final plot.
Cheers,
Hernán