|home articles forums media subtitles software||login|
|articles search | recommended | recent||request an article | submit an article|
|Articles » #Miscellaneous||Discuss Article |
|TeleCine/IVTC - An Explanation by Johnny_one_eye||7/20/2003|
Video and Audio Syncing Problem: Why and How.
Since the first release of Powerip in mid-1999, people have been experiencing the problem of determining the correct speed of video and audio when converting an NTSC mpeg-2 video/audio stream to any other format possible (e.g. mpeg-1, avi, asf, or divx) to get a perfect video and audio syncing.
This video and audio syncing problem is the result of an incorrect conversion of the mpeg-2 video stream (either using Powerip, mpeg2avi or any other conversion utility out there). This document is not meant to discard Squeezer or Flask, but it is in fact can be considered as a support so PERHAPS, the explanation can be applied to perfect-ize both Squeezer or Flask -- or even the AGrabber plugin.
Why? Let's see the process of transferring a 35mm film format to an NTSC video format, to see the root of this evil.
35mm Film to NTSC Video Conversion
Movie is usually made on a 35mm Film Negative. This format has a 24 FRAME per second speed. A Frame is the smallest unit of a FILM format. NTSC Video is a "field-based" format of 59.94 FIELD per second. A Field is the smallest unit in Video format. 2 Fields made up into 1 FRAME. So, this 59.94 FIELD per second equals 29.97 FRAME per second. Now we can see the difference. 1 second in FILM (24 frame) is NOT equal to 1 second in NTSC Video (29.97 frame).
To be able to "match" the speed of an NTSC Video, conversion from a FILM format to an NTSC Video format undergone a process called "2:3 pulldown" or TELECINE. This process, in its simplest term, means "to add 6 frames so that a 24 fps becomes 30fps -- which is VERY close to 29.97fps". The problem that rises when doing this TELECINE transfer, is to decide WHICH 6 FRAMES to be added - or REPEATED?
Some kind of community of film/moviemaker/videomaker/engineers created a STANDARDIZATION of this TELECINE conversion. Since a Video FRAME consist of 2 Fields, why not make the FILM format into Field first, so that the smallest unit of both formats is the same? Let's see the process:
1. 24 FRAMES becomes 48 FIELDS
Frame A becomes 2 fields: Atopfield +Abottomfield. Thus, 4 Frames becomes 8 Fields, and 24 Frames becomes 48 Fields. This "field-based" material is then TELECINED into an NTSC Video signal. As TELECINE is a STANDARDIZED conversion, we have to follow the rules of engagement ;). The rule is to do a REPEAT_FIRST_FIELD in a 2:3 sequence.
2. 4 FRAMES (8 FIELDS) becomes 5 FRAMES (10 FIELDS)
If we look closely, we can see a sequence of At Al At followed by Bl Bt then Cl Ct Cl then Dt Dl. But, since 1 FRAME consists of 2 FIELDS, then the sequence becomes AA AB BC CC DD. What we have now is a conversion from 4 SOLID frame into 5 FRAMES consisting of 3 SOLID FRAMES and 2 INTERLACED FRAMES. By INTERLACED I am referring to a FRAME that's made-up from 2 FIELDs of DIFFERENT FRAME source. The AB frame is the example.
So, 4 FRAMES becomes 5 FRAMES, thus 24 becomes..... 30, DONE! Done? Nope, not by a longshot. The NTSC Video is 29.97fps, so PLAYBACK of 30fps must be slow-down into 29.97fps, which brings us to the term DROP_FRAME.
Don't get a wrong concept of DROP_FRAME as "FRAMES being REMOVED or DROPPED". In a 30fps Video sequence, a DROP_FRAME time code counts video frames accurately in relationship to real time. DROP_FRAME time code counts each video frame, but, when that .03 finally adds up to a video frame, it skips (or drops) a number. It does not drop a film or video frame, it merely skips a number and continues counting. This allows it to keep accurate time. So if you're cutting a scene using drop frame time code, and the duration reads as, say, 30 minutes and 0 frames, then you can be assured the duration is really 30 minutes. Confusing? Well, to put it in simple term, DROP_FRAME here is in essence EQUAL a SLOWED_DOWN playback from a pure 30fps into the correct NTSC 29.97fps SPEED. In an MPEG-2 domain, this means that the 00 and 01 frames are dropped or SKIPPED from time code, at the start of each minute except minutes which are even multiples of 10.
NOW, it is DONE.
Telecine in MPEG-2 Video
In an Mpeg-2 Video, storing a 30fps frames in 1 second will create a much bigger files than storing a 24 frames. If you do your calculation, a 1 second of 24 frames is 20% SMALLER in SIZE than 1 second of 30fps. But, as we have already discussed, NTSC video should be 29.97fps. It would mean that ALL movies that's created from 35mm FILM should be TELECINED, then ENCODED to 29.97fps Mpeg-2 Video stream, right? ..... NO!
A good thing about Mpeg-2 Video is that it can contain some FLAGS or PROGRAMMING, that would tell a SOFTWARE or HARDWARE to perform a TELECINE when playing the Video. Since the INTERLACED FRAMES that made-up the 29.97fps is a REPEATED field(s), it is REDUNDANT, and TRASHABLE. Just let the FLAGS tells the player to perform the TELECINE. Really, it CAN do that ;). The benefit of this that the movie CAN be stored in its original 24 FRAME per second, and thus SAVE 20% of total filesize!.
|articles forums media subtitles software hardware search||login|
|About us | Help us | Donate | Credits | Downloads/goodies | Partners||©2000-2008 Divxstation L - Legal information +|