Last week I went for a ride on a rather grey day. The route was one of my
usuals (40.2km in to the Goodwill Bridge), and I shaved a minute or two off
the ride time. I was mightily disappointed to find that both ipBike and
Strava reckoned I had ridden a bare 100m. This, despite the ipBike
summary field claiming "40.270km with 347m climb in 1:43:41".
I had an incident like this happen to me earlier this year and was unable
to fix it using SNAP, so I figured it was time to bite the bullet and fix
the recorded file, particularly since the temperature, heart rate and cadence
all appeared to be correctly recorded.
My first pass attempted to make use of Tomo Krajina's gpxpy, which was
fine until I realised that that library cannot handle the TrackPointExtensions
that Garmin defined.
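
For context, a Garmin TrackPointExtension sits inside the extensions node of each trkpt, in its own namespace. The following is a minimal sketch of reaching those values with the standard library's xml.etree.ElementTree. The sample point, its values and the helper name sensor_values are invented for illustration.

```python
import xml.etree.ElementTree as ET

GPX = "{http://www.topografix.com/GPX/1/1}"
TPX = "{http://www.garmin.com/xmlschemas/TrackPointExtension/v1}"

# Invented sample point; real files carry one of these per recorded sample
SAMPLE = """<trkpt xmlns="http://www.topografix.com/GPX/1/1"
    xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
    lat="-27.4810" lon="153.0270">
  <ele>12.0</ele>
  <time>2015-06-01T21:15:03Z</time>
  <extensions>
    <gpxtpx:TrackPointExtension>
      <gpxtpx:atemp>17</gpxtpx:atemp>
      <gpxtpx:hr>142</gpxtpx:hr>
      <gpxtpx:cad>88</gpxtpx:cad>
    </gpxtpx:TrackPointExtension>
  </extensions>
</trkpt>"""


def sensor_values(trkpt):
    """Return {name: text} for each child of the TrackPointExtension, if any."""
    tpx = trkpt.find("%sextensions/%sTrackPointExtension" % (GPX, TPX))
    if tpx is None:
        return {}
    # strip the "{namespace}" prefix that ElementTree puts on every tag
    return {child.tag.split("}", 1)[1]: child.text for child in tpx}


point = ET.fromstring(SAMPLE)
values = sensor_values(point)
```

Any fix-up has to carry this extensions subtree through untouched, which is why a library that drops it is a non-starter here.
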
I then tried to make headway using minidom, but got myself tied in knots
trying to create new document nodes. I'm sure I missed something quite obvious
there but I'm not really worried. Note in passing: Lode Nachtergaele's post at
http://castfortwo.blogspot.com.au/2014/06/parsing-strava-gpx-file-with-python.html
was really useful, and helped with my final attempt.
My final (and successful) attempt uses lxml.etree to pull out the info I
need, skip a few points (needed because the rides had different elapsed times,
but somewhat dubious) and then create a new GPX document with the munged data
points.
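
One possible alternative to skipping points is to pair each broken point with the reference point closest to it in elapsed time. This is not what the script does. It is a sketch under the assumption that both rides are keyed by elapsed seconds; the helper name nearest_offset and all sample data are hypothetical.

```python
from bisect import bisect_left


def nearest_offset(sorted_offsets, target):
    """Return the entry of sorted_offsets closest to target (ties go low)."""
    i = bisect_left(sorted_offsets, target)
    if i == 0:
        return sorted_offsets[0]
    if i == len(sorted_offsets):
        return sorted_offsets[-1]
    before, after = sorted_offsets[i - 1], sorted_offsets[i]
    return before if target - before <= after - target else after


# Invented data: elapsed seconds -> position taken from the good ride
reference = {
    0: {"lat": "-27.5000", "lon": "153.0000", "ele": "10.0"},
    5: {"lat": "-27.5004", "lon": "153.0003", "ele": "11.0"},
    10: {"lat": "-27.5008", "lon": "153.0006", "ele": "12.5"},
}
# Elapsed seconds of the points in the broken ride (sensor data omitted)
broken_offsets = [0, 2, 3, 7, 9, 14]

ref_offsets = sorted(reference)
# For each broken point, the reference offset whose position it would borrow
matched = {t: nearest_offset(ref_offsets, t) for t in broken_offsets}
```

If the two rides differ noticeably in total time, scaling each target offset by the ratio of the ride durations before the lookup may give a better fit.
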
While I've now got a close-enough fixed-up file, I'm down about 2km on the ride
total, and up about 30m on the climbing total (according to Strava). I am
quite happy with the results overall, though more than willing to accept that
my code (below) is rather fugly. Good thing I'm not integrating this to a
project gate!
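
Separately from the script itself, the distance and climb totals of a fixed-up file can be roughly sanity-checked before uploading. The sketch below sums great-circle distances and positive elevation changes between consecutive points. How Strava computes its totals is not covered here, so small differences should be expected; the Earth radius constant and the function names are assumptions.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def track_length_m(points):
    """Sum the distances between consecutive (lat, lon) pairs."""
    return sum(haversine_m(a[0], a[1], b[0], b[1])
               for a, b in zip(points, points[1:]))


def total_climb_m(elevations):
    """Sum only the positive elevation changes between consecutive samples."""
    return sum(max(b - a, 0.0) for a, b in zip(elevations, elevations[1:]))
```

Feeding these the lat, lon and ele values of the reference file and of the fixed file gives two pairs of totals to compare against each other.
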
#!/usr/bin/python
#
# Copyright (c) 2015, James C. McPherson. All Rights Reserved
#

from copy import deepcopy
from datetime import datetime

from lxml import etree as ET

NSMAP = {
    'gpxx': 'http://www.garmin.com/xmlschemas/GpxExtensions/v3',
    None: 'http://www.topografix.com/GPX/1/1',
    'gpxtpx': 'http://www.garmin.com/xmlschemas/TrackPointExtension/v1',
    'xsi': 'http://www.w3.org/2001/XMLSchema-instance'
}

# xsi:schemaLocation is a space-separated list of namespace/schema pairs
schemaLocation = " ".join([
    "http://www.topografix.com/GPX/1/1",
    "http://www.topografix.com/GPX/1/1/gpx.xsd",
    "http://www.garmin.com/xmlschemas/GpxExtensions/v3",
    "http://www.garmin.com/xmlschemas/GpxExtensionsv3.xsd",
    "http://www.garmin.com/xmlschemas/TrackPointExtension/v1",
    "http://www.garmin.com/xmlschemas/TrackPointExtensionv1.xsd",
])

gpxns = "{http://www.topografix.com/GPX/1/1}"
extns = "{http://www.garmin.com/xmlschemas/TrackPointExtension/v1}"

reftracks = []
failtracks = []


def epoch(rfc3339):
    """Convert a UTC timestamp string to whole seconds since 1970-01-01."""
    try:
        t = datetime.strptime(rfc3339, '%Y-%m-%dT%H:%M:%S.%fZ')
    except ValueError:
        t = datetime.strptime(rfc3339, '%Y-%m-%dT%H:%M:%SZ')
    # strftime("%s") is platform-specific, so do the arithmetic directly
    return int((t - datetime(1970, 1, 1)).total_seconds())


def parseTrack(trk, stime, keep=None):
    """Return {seconds since stime: point data} for each trkpt in trk."""
    tracks = {}
    for s in trk.findall("%strkseg" % gpxns):
        for p in s.findall("%strkpt" % gpxns):
            # latitude and longitude are attributes of the trkpt node
            # but elevation is a child node in its own right
            el = {}
            el['lat'] = p.get("lat")
            el['lon'] = p.get("lon")
            el['ele'] = p.findtext("%sele" % gpxns)
            if keep:
                # keep the whole node so the sensor extensions survive
                el['trkpt'] = deepcopy(p)
            rfc3339 = p.find("%stime" % gpxns).text
            el['time'] = rfc3339
            tracks[epoch(rfc3339) - stime] = el
    return tracks


##
# Main routine starts here.
##
rf = ET.parse("goodfile")
ff = ET.parse("dodgyfile")

rstimestr = rf.getroot().find("%smetadata" % gpxns).find("%stime" % gpxns).text
rstime = epoch(rstimestr)
fstimestr = ff.getroot().find("%smetadata" % gpxns).find("%stime" % gpxns).text
fstime = epoch(fstimestr)

for track in rf.findall("%strk" % gpxns):
    reftracks.append(parseTrack(track, rstime, False))
for track in ff.findall("%strk" % gpxns):
    failtracks.append(parseTrack(track, fstime, True))

# Now we need to fix node attributes in failtracks
# We're being lazy, so assume only one track per file for now
rpts = len(reftracks[0])
fpts = len(failtracks[0])
if fpts > rpts:
    skipn = fpts % rpts
else:
    skipn = rpts % fpts

# create the new document, with a "fixed" track inside it
gpx = ET.Element("%sgpx" % gpxns, nsmap=NSMAP)
gpx.set("creator", "James C. McPherson")
gpx.set("version", "1.1")
gpx.set("{http://www.w3.org/2001/XMLSchema-instance}schemaLocation",
        schemaLocation)
ntrack = ET.SubElement(gpx, "%strk" % gpxns)
ntrkname = ET.SubElement(ntrack, "%sname" % gpxns)
ntrkname.text = ff.find("%strk" % gpxns).find("%sname" % gpxns).text
ntseg = ET.SubElement(ntrack, "%strkseg" % gpxns)

for (n, offset) in enumerate(sorted(failtracks[0])):
    # skipn == 0 would make the modulus blow up; in that case keep everything
    if skipn and n % skipn == 0:
        continue
    goodn = reftracks[0].get(offset)
    if goodn is None:
        # the reference ride has no point at this elapsed second
        continue
    badn = failtracks[0][offset]['trkpt']
    badn.set('lat', goodn['lat'])
    badn.set('lon', goodn['lon'])
    badne = badn.find("%sele" % gpxns)
    if badne is None:
        # the GPX schema wants ele as the first child of a trkpt
        badne = ET.SubElement(badn, "%sele" % gpxns)
        badn.insert(0, badne)
    badne.text = goodn['ele']
    # ... and extensions as the last child
    extn = badn.find("%sextensions" % gpxns)
    if extn is not None:
        badn.append(extn)
    ntseg.append(badn)

# write a new file....
et = ET.ElementTree(gpx)
et.write("fixedp.gpx", xml_declaration=True, encoding="UTF-8")
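
On the "somewhat dubious" point-skipping: the modulus rule in the script does not drop the surplus number of points, only roughly fpts / skipn of them. The sketch below works through that arithmetic and shows one hypothetical alternative that picks an exact number of evenly spaced indices. The function names and the point counts used are invented for illustration.

```python
def dropped_by_modulus(fpts, rpts):
    """Count the points discarded by the script's `n % skipn == 0` rule."""
    skipn = max(fpts, rpts) % min(fpts, rpts)
    if skipn == 0:
        return 0
    return len([n for n in range(fpts) if n % skipn == 0])


def evenly_thinned(count, target):
    """Pick `target` indices spread evenly over range(count)."""
    if target >= count:
        return list(range(count))
    # integer arithmetic keeps the picks strictly increasing when count > target
    return [i * count // target for i in range(target)]


# Invented counts: 6300 points in the broken ride, 6000 in the reference
surplus = 6300 - 6000
dropped = dropped_by_modulus(6300, 6000)
kept = evenly_thinned(6300, 6000)
```

With these counts the surplus is 300 points, yet the modulus rule drops only 21 of them, whereas evenly_thinned keeps exactly 6000 distinct indices.
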