Thanks Viraltux,
The assumptuion is that the model is reflected similarly in long/short sequences, i.e., short sequences teach the model about relations later on seen in longer sequences. Think about partial sequences' availability. Hence, supposedly you should not feel the difference between...