Frame rate
Frame rate is the number of still frames a video displays each second. It is measured in frames per second (FPS) or, equivalently, hertz (Hz).
Because of the way the human visual system works, a sequence of images shown at more than about 10 to 12 frames per second is perceived as continuous motion rather than as separate pictures. This phenomenon is called persistence of vision. It is why film, although shot frame by frame, appears continuous when played back quickly.
PAL (the TV broadcast format used in Europe, Asia, Australia and other regions) and SECAM (used in France, Russia, parts of Africa and other regions) specify a frame rate of 25 fps, while NTSC (used in the United States, Canada, Japan and other regions) specifies 29.97 fps. Motion picture film is traditionally shot at a slightly slower 24 fps.
Adjacent pictures in a video sequence are usually very similar, that is, they contain a lot of redundancy. Video encoders exploit this redundancy between frames to improve the compression ratio.
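To make these rates concrete, the following minimal sketch converts each frame rate into the time between consecutive frames. The function and dictionary names are illustrative only; the NTSC value assumes the exact rate of 30000/1001 fps, which is what 29.97 fps is a rounding of.

```python
from fractions import Fraction

# Frame rates mentioned in the text. NTSC is stored as the exact ratio
# 30000/1001, which rounds to 29.97 fps.
FRAME_RATES = {
    "film": Fraction(24),
    "PAL/SECAM": Fraction(25),
    "NTSC": Fraction(30000, 1001),
}


def frame_interval_ms(fps):
    """Return the time between two consecutive frames, in milliseconds."""
    return float(Fraction(1000) / Fraction(fps))


if __name__ == "__main__":
    for name, fps in FRAME_RATES.items():
        print(f"{name}: {float(fps):.3f} fps, "
              f"{frame_interval_ms(fps):.3f} ms per frame")
```

At 25 fps, for example, a new frame is shown every 40 ms.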
Frame/Group (GOP)
Frames are processed in groups, and each group is called a GOP (group of pictures). One frame in each group, usually the first, is encoded without motion estimation, that is, without reference to any other frame. Such frames are called intra frames or I-frames. The other frames in the group are inter frames, usually P-frames. This structure is commonly written IPPPP: the first frame is encoded as an I-frame and the remaining frames as P-frames.
I-frames make fast forward, rewind and other random access functions possible, because decoding can start at any I-frame. The encoder inserts I-frames automatically at fixed time intervals, or on demand, for example when a new client starts viewing the stream. The disadvantage of an I-frame is that it consumes more bits. On the other hand, because it does not depend on other frames, it produces few artifacts when data is lost.
A P-frame is a predictive inter frame that uses earlier I-frames and/or P-frames as references. P-frames usually require fewer bits than I-frames, but they are sensitive to transmission errors because they depend heavily on those earlier frames.
Prediction is not limited to past frames: a future frame can also be used to predict the current frame. For this to work, the future frame must be encoded before the current frame, so the encoding order differs from the playback order. A frame predicted from both past and future I-frames or P-frames is called a bidirectionally predicted frame, or B-frame. A typical frame pattern for this scheme, written in playback order, is IBBPBBPBB.
In other words, a B-frame is a bi-predictive inter frame that references both an earlier frame and a later frame. Using B-frames increases delay, because a B-frame cannot be encoded or decoded until the later reference frame is available.
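As an illustration of the IPPPP structure, the sketch below labels each frame of a sequence as I or P for a fixed GOP length. It assumes every GOP starts with an I-frame and contains no B-frames; the function name gop_pattern is hypothetical and does not belong to any real encoder API.

```python
def gop_pattern(total_frames, gop_length):
    """Label each frame 'I' or 'P', assuming a fixed GOP length.

    The first frame of every group is an I-frame; all others are P-frames.
    """
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")
    return "".join(
        "I" if index % gop_length == 0 else "P"
        for index in range(total_frames)
    )


if __name__ == "__main__":
    # Two groups of five frames each.
    print(gop_pattern(10, 5))  # prints IPPPPIPPPP
```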
P-frames can only refer to previous I-frames or P-frames, while B-frames can refer to previous or subsequent I-frames or P-frames.
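The difference between playback order and encoding order can be sketched as follows. This is a simplified model, not how any particular codec is implemented: it assumes each B-frame depends on the nearest following I-frame or P-frame, so that reference frame is emitted first. Trailing B-frames with no later reference in the input are simply appended at the end, which is an assumption of this sketch (in a real stream they would wait for the next group's reference frame).

```python
def coding_order(display_pattern):
    """Reorder frames from playback order into a possible encoding order.

    display_pattern is a string such as "IBBPBBP". The result is a list of
    (frame_type, display_index) tuples in the order they would be encoded.
    """
    ordered = []
    pending_b = []  # B-frames waiting for their future reference frame
    for index, frame_type in enumerate(display_pattern):
        if frame_type == "B":
            pending_b.append((frame_type, index))
        else:
            # An I-frame or P-frame: encode it first, then the B-frames
            # that were waiting for it.
            ordered.append((frame_type, index))
            ordered.extend(pending_b)
            pending_b = []
    ordered.extend(pending_b)  # assumption: flush leftovers at the end
    return ordered


if __name__ == "__main__":
    frames = coding_order("IBBPBBP")
    print(" ".join(f"{t}{i}" for t, i in frames))
    # prints I0 P3 B1 B2 P6 B4 B5
```

The output shows why B-frames add delay: frames 1 and 2 cannot be encoded until frame 3 has been captured and encoded.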
Some network video encoding products support a user-defined GOP length (some products call it the I-frame interval), which determines how many P-frames are sent before the next I-frame. Reducing the frequency of I-frames (a longer GOP) lowers the bit rate and reduces the video file size. However, on a congested network a longer GOP can degrade video quality: errors caused by packet loss persist until the next I-frame arrives.
When the GOP length is specified as a time, it is generally set to 1 s or 2 s, that is, one I-frame every 25 (30) frames or every 50 (60) frames at 25 (30) fps. When it is specified as a number of frames, the GOP length or I-frame interval is set to 25 or 30 for 1 s, or 50 or 60 for 2 s, depending on the frame rate.
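The effect of GOP length on bit rate can be estimated with a rough model. The sketch below assumes an IPPP structure in which every I-frame has one fixed size and every P-frame another; the sizes used in the example are made-up illustrative numbers, not measurements, and real frame sizes vary with scene content.

```python
def average_bitrate(i_frame_bits, p_frame_bits, gop_length, fps):
    """Estimate the average bit rate (bits per second) of an IPPP stream.

    Each GOP holds one I-frame and (gop_length - 1) P-frames.
    """
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")
    bits_per_gop = i_frame_bits + (gop_length - 1) * p_frame_bits
    return bits_per_gop / gop_length * fps


if __name__ == "__main__":
    # Hypothetical sizes: 100,000 bits per I-frame, 20,000 bits per P-frame.
    for gop in (25, 50):
        rate = average_bitrate(100_000, 20_000, gop, fps=25)
        print(f"GOP {gop}: about {rate / 1000:.0f} kbit/s")
```

With these assumed sizes, doubling the GOP length from 25 to 50 frames lowers the estimate from about 580 kbit/s to about 540 kbit/s, because the cost of each I-frame is spread over more frames.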
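The conversion between the two units is a simple multiplication. A minimal sketch, with a hypothetical helper name, that rounds to the nearest whole frame so that 29.97 fps is treated like 30:

```python
def gop_length_in_frames(fps, seconds):
    """Convert a GOP length given in seconds into a whole number of frames."""
    return round(fps * seconds)


if __name__ == "__main__":
    for fps in (25, 30):
        for seconds in (1, 2):
            frames = gop_length_in_frames(fps, seconds)
            print(f"{fps} fps, {seconds} s -> {frames} frames")
```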