Proposal Statistics [message #631] |
Mon, 15 November 2004 20:26 |
markus.ullius
Messages: 8 Registered: November 2004
Hello
Since I'm working on a statistics export for OpenTimeTable, I have the
following proposals for the STATISTIC data:
1) Change the name of "source" to "dataSource", as it is in TRAIN.
2) Add an entry "statisticType" with the values "mean", "median", ... to
distinguish different statistics.
3) Maybe one could also add the field "type", as in ENTRY, with the values
"stop", "pass" and so on.
4) Standard deviations and similar measures would need some float
values, but I have to think about this in detail; it's just a first idea.
See the example below.
Best Regards
Markus Ullius
<train trainID="6203" type="planned" dataSource="opentimetable" dataStatus="planned">
  <timetableentries>
    <entry posID="WH" departure="05:34:00" departureDay="0" type="begin">
      <statistic source="opentimetable" departure="05:34:46" departureDay="0" type="begin" statisticType="mean">
      </statistic>
      <statistic source="opentimetable" departure="05:34:46" departureDay="0" type="begin" statisticType="median">
      </statistic>
    </entry>
    <entry posID="LZ" arrival="05:56:00" arrivalDay="0" type="end">
      <statistic source="opentimetable" arrival="05:58:03" arrivalDay="0" type="end" statisticType="mean">
      </statistic>
      <statistic source="opentimetable" arrival="05:58:03" arrivalDay="0" type="end" statisticType="median">
      </statistic>
    </entry>
  </timetableentries>
</train>
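As a rough sketch of point 4 only: a standard deviation might be carried as a float value next to the time attributes. The statisticType value "standardDeviation", the attribute name "departureDeviation", its unit (assumed here to be seconds) and the number shown are all hypothetical placeholders; none of this is part of the proposal yet.

```xml
<!-- Hypothetical sketch: "standardDeviation", "departureDeviation" and 42.5 are placeholders, not agreed names or measured data -->
<entry posID="WH" departure="05:34:00" departureDay="0" type="begin">
  <statistic source="opentimetable" departureDeviation="42.5" type="begin" statisticType="standardDeviation">
  </statistic>
</entry>
```

A separate float attribute keeps the time-of-day attributes ("departure", "arrival") free for values that really are times, which a spread is not.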
Re: Proposal Statistics [message #636 is a reply to message #633] |
Tue, 15 February 2005 15:00 |
andreas.voss
Messages: 1 Registered: February 2005
Markus Ullius wrote:
>> There is just one question: Shouldn't we have some additional
>> information about the statistical data like the begin and the end date
>> of period where the resulting statistics were measured?
> When you also export real timetable data you can determine where the
> statistics are calculated from. If you only have a start- and an end-date
> what will happen if there are some missing days in between?
Hi Markus,
Of course, there are many aspects one MIGHT want to take care of when
modelling statistical data - to get some ideas, have a look at the work of
the SDMX (statistical data and metadata exchange, www.sdmx.org)
initiative. Certainly, not all of this is necessary within the realm of
RailML and I am not sure how far we actually need to go.
Maybe the main concern should not be an in-depth description of the
origin of the dataset (dates, methods and tools of analysis etc.), but to
allow a way of modelling in which each timetable entry in a file CAN have
a unique identifier (e.g., in order to avoid conflicts when the data is
imported into a relational database or an application that is based on
one).
For example, if you have the mean departure time of March and the mean
departure time of calendar week 11 in the same file, an application that
is reading the file must be able to make a distinction. Whether this is
based on a somewhat arbitrary identifier ("DatasetMarkus418b" or
"MonthlyStatisticsMarch2005") or an explicit listing of all parameters
describing the dataset and its origin is - in my opinion - of minor
concern.
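To illustrate the distinction, here is a minimal sketch of two mean departure times for the same entry that differ only in the dataset they belong to. The attribute name "datasetID" and the identifier "WeeklyStatisticsCW11" are hypothetical, and the departure values are placeholders rather than measured data; the other attributes follow Markus's example.

```xml
<!-- Hypothetical sketch: "datasetID" is an assumed attribute name; the time values are placeholders -->
<entry posID="WH" departure="05:34:00" departureDay="0" type="begin">
  <!-- mean over the month of March -->
  <statistic source="opentimetable" datasetID="MonthlyStatisticsMarch2005" departure="05:34:46" departureDay="0" type="begin" statisticType="mean">
  </statistic>
  <!-- mean over calendar week 11 only -->
  <statistic source="opentimetable" datasetID="WeeklyStatisticsCW11" departure="05:34:51" departureDay="0" type="begin" statisticType="mean">
  </statistic>
</entry>
```

With such an identifier, an importing application could use something like (trainID, posID, statisticType, datasetID) as a key without the two rows colliding.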
Regards
Andreas
Re: Proposal Statistics [message #637 is a reply to message #636] |
Fri, 25 February 2005 10:10 |
markus.ullius
Messages: 8 Registered: November 2004
I think it's a good idea to have a field describing the content of the
statistical data, as you have mentioned. This field could hold either a
manual entry or, as default text, the settings of e.g. OpenTimeTable
(date period, day selection, time slot, ...).
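A minimal sketch of how such a field might look, assuming a free-text attribute. The attribute name "description" and the settings text inside it are hypothetical and purely illustrative; they are not actual OpenTimeTable output.

```xml
<!-- Hypothetical sketch: "description" is an assumed attribute name; its content is an invented example of default text -->
<statistic source="opentimetable" departure="05:34:46" departureDay="0" type="begin" statisticType="mean"
           description="date period: March 2005; day selection: Mon-Fri; time slot: 05:00-09:00">
</statistic>
```

Since the text is free-form, it would serve human readers; an application that has to tell datasets apart would still need something it can compare exactly.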
Best regards
Markus
> For example, if you have the mean departure time of March and the mean
> departure time of calendar week 11 in the same file, an application that
> is reading the file must be able to make a distinction. Whether this is
> based on a somewhat arbitrary identifier ("DatasetMarkus418b" or
> "MonthlyStatisticsMarch2005") or an explicit listing of all parameters
> describing the dataset and its origin is - in my opinion - of minor
> concern.