H.264 Coding Basics and the FFmpeg Decoding Flow

xpp02

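1. What NAL, slice, and frame mean, and how they relate

NAL stands for Network Abstraction Layer. It packages the coded data into NAL units, each starting with a small header, so that the stream can be carried over a network or stored in a file.
A slice is a portion of a picture. H.264 codes a picture as one frame or two fields, and a frame can be divided into one or more slices. A slice is made up of macroblocks (MB), and the macroblock is the basic unit of coding.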

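2. NAL nal_unit_type values 1 (coded slice of a non-IDR picture), 2 (coded slice data partition A), 3 (coded slice data partition B), 4 (coded slice data partition C), and 5 (coded slice of an IDR picture),
versus the three slice coding modes: I_slice, P_slice, and B_slice
These five nal_unit_type values say what kind of data follows in the NAL unit and how it is partitioned.
I_slice, P_slice, and B_slice mean I-type, P-type, and B-type slices. An I_slice is coded with intra prediction only; a P_slice uses unidirectional prediction or intra mode; a B_slice uses bidirectional prediction or intra mode.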

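3. How do the three frame types (I frame, P frame, B frame) map onto these?
I, P, and B frames relate to each other the same way I_slice, P_slice, and B_slice do. The difference between a slice and a frame is covered in question 1.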

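4. Finally, what kind of frame do nal_unit_type 6 (SEI), 7 (SPS), and 8 (PPS) belong to?
None. A NAL unit whose nal_unit_type is sequence parameter set (SPS), picture parameter set (PPS), or supplemental enhancement information (SEI) is not a frame at all. The type only says that the data that follows is an SPS, a PPS, or SEI.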
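
To make the four answers above concrete, here is a minimal sketch of reading nal_unit_type from the first byte of a NAL unit. The bit layout (1 bit forbidden_zero_bit, 2 bits nal_ref_idc, 5 bits nal_unit_type) follows the H.264 NAL unit header; the names NalHeader, parse_nal_header, nal_is_slice_data, and nal_type_name are made up for this illustration and do not come from x264 or FFmpeg.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the one-byte H.264 NAL unit header (illustrative struct). */
typedef struct {
    int forbidden_zero_bit; /* 1 bit, expected to be 0 */
    int nal_ref_idc;        /* 2 bits, 0 means not used for reference */
    int nal_unit_type;      /* 5 bits, e.g. 1..5 slice data, 6 SEI, 7 SPS, 8 PPS */
} NalHeader;

/* Split the first byte of a NAL unit into its three fields. */
static NalHeader parse_nal_header(uint8_t first_byte) {
    NalHeader h;
    h.forbidden_zero_bit = (first_byte >> 7) & 0x01;
    h.nal_ref_idc        = (first_byte >> 5) & 0x03;
    h.nal_unit_type      =  first_byte       & 0x1F;
    return h;
}

/* Types 1..5 carry coded slice data; 6, 7, and 8 do not belong to any frame. */
static int nal_is_slice_data(int nal_unit_type) {
    return nal_unit_type >= 1 && nal_unit_type <= 5;
}

/* Human-readable name for the types discussed in this article. */
static const char *nal_type_name(int nal_unit_type) {
    assert(nal_unit_type >= 0 && nal_unit_type <= 31); /* 5-bit field */
    switch (nal_unit_type) {
    case 1:  return "non-IDR slice";
    case 2:  return "slice data partition A";
    case 3:  return "slice data partition B";
    case 4:  return "slice data partition C";
    case 5:  return "IDR slice";
    case 6:  return "SEI";
    case 7:  return "SPS";
    case 8:  return "PPS";
    default: return "other";
    }
}
```

For example, a first byte of 0x67 yields nal_unit_type 7 (SPS), and 0x65 yields 5 (IDR slice).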

====================================================================================

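Every NAL unit starts with an H.264 NAL type, which tells you what kind of information it carries. If the type is one of the video-data types
(H264NT_SLICE_DPA, H264NT_SLICE_DPB, H264NT_SLICE_DPC, or H264NT_SLICE_IDR), the unit also contains a slice header, from which you can tell whether it is an I-slice, a P-slice, or a B-slice. After that, each macroblock has its own MB header, from which you can determine the macroblock mode.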

This is how H.264 organizes its information: in layers, from NAL unit to slice to macroblock. I hope that makes it clear.
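
As a rough sketch of the "slice header tells you I, P, or B" step: in the H.264 syntax the slice header begins with two Exp-Golomb coded fields, first_mb_in_slice and then slice_type, where slice_type 0 and 5 mean P, 1 and 6 mean B, and 2 and 7 mean I. The code below decodes just those two fields. BitReader, read_bit, read_ue, and slice_type_letter are names invented for this example, and it assumes the payload has no emulation prevention bytes in its first few bytes, which a real parser would have to strip first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal MSB-first bit reader (illustrative, not from any library). */
typedef struct {
    const uint8_t *data;
    size_t size;    /* length of data in bytes */
    size_t bit_pos; /* index of the next bit, counted from the MSB of data[0] */
} BitReader;

static int read_bit(BitReader *br) {
    assert(br->bit_pos < br->size * 8); /* never read past the buffer */
    int bit = (br->data[br->bit_pos >> 3] >> (7 - (br->bit_pos & 7))) & 1;
    br->bit_pos++;
    return bit;
}

/* Unsigned Exp-Golomb, ue(v): count n leading zero bits, skip the 1 bit,
 * read n more bits as a suffix; the value is 2^n - 1 + suffix. */
static unsigned read_ue(BitReader *br) {
    int zeros = 0;
    while (read_bit(br) == 0)
        zeros++;
    assert(zeros < 32);
    unsigned suffix = 0;
    for (int i = 0; i < zeros; i++)
        suffix = (suffix << 1) | (unsigned)read_bit(br);
    return (1u << zeros) - 1u + suffix;
}

/* payload points at the bytes after the one-byte NAL header of a slice NAL unit.
 * Returns 'P', 'B', or 'I', or '?' for the SP and SI types. */
static char slice_type_letter(const uint8_t *payload, size_t size) {
    BitReader br = { payload, size, 0 };
    (void)read_ue(&br);                     /* first_mb_in_slice, not needed here */
    unsigned slice_type = read_ue(&br) % 5; /* values 5..9 repeat 0..4 */
    switch (slice_type) {
    case 0:  return 'P';
    case 1:  return 'B';
    case 2:  return 'I';
    default: return '?';                    /* 3 = SP, 4 = SI */
    }
}
```

For instance, a payload beginning with the byte 0x88 decodes as first_mb_in_slice = 0 and slice_type = 7, an I slice.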

====================================================================================

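On each call, x264_encoder_encode receives one frame to be encoded, pic_in, as a parameter. The function first takes a frame from the free queue to hold the new frame, and sets its i_frame to the display-order counter, for example: fenc->i_frame = h->frames.i_input++.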

The FFmpeg Decoding Flow

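1. Starting from the basics
A few concepts first, to make the later analysis easier to follow:
Container: in audio/video, a container usually means a specific file format that describes the audio, video, subtitle, and other data it holds.
Stream: a subtle word that shows up in many places, such as TCP and SVR4 systems. In audio/video, you can think of it as one pure sequence of audio data or of video data.
Frames: hard to pin down precisely. A frame is one data unit within a stream. Really understanding the concept may require some audio/video coding theory.
Packet: the raw data of a stream.
Codec: Coder + Decoder.
All of these concepts are reflected well in FFmpeg, as we will see step by step in the analysis that follows.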

2. The basic decoding flow
Being lazy, I will again borrow the overview of the flow from <An ffmpeg and SDL Tutorial>:

10 OPEN video_stream FROM video.avi
20 READ packet FROM video_stream INTO frame
30 IF frame NOT COMPLETE GOTO 20
40 DO SOMETHING WITH frame
50 GOTO 20

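That is the whole decoding process. At first glance it looks like nothing special :). But things have a shallow side and a deep side, and going from shallow to deep and then back to shallow is probably where the fun is. Our story starts here.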
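
The five-line flow above can be mimicked without any FFmpeg calls. The following toy is only a sketch under made-up assumptions: ToyPacket, ToyFrame, toy_decode, and toy_decode_loop are hypothetical names, packets come from an in-memory array instead of video.avi, and a frame counts as complete when a packet is flagged as its last piece.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; these are not FFmpeg structures. */
typedef struct {
    int stream_index; /* which stream the packet belongs to */
    int size;         /* payload size in bytes */
    int ends_frame;   /* nonzero if this packet completes a frame */
} ToyPacket;

typedef struct {
    int bytes;    /* bytes accumulated for the frame being built */
    int complete; /* nonzero once the frame is complete */
} ToyFrame;

/* Step 20: read one packet INTO the frame under construction. */
static void toy_decode(ToyFrame *frame, const ToyPacket *pkt) {
    assert(frame != NULL && pkt != NULL);
    frame->bytes += pkt->size;
    frame->complete = pkt->ends_frame;
}

/* Runs steps 20 to 50 over an array of packets and returns how many
 * complete frames were produced for the given stream. */
static int toy_decode_loop(const ToyPacket *pkts, size_t count, int video_stream) {
    ToyFrame frame = { 0, 0 };
    int frames = 0;
    for (size_t i = 0; i < count; i++) {
        if (pkts[i].stream_index != video_stream)
            continue;                /* packet of another stream, e.g. audio */
        toy_decode(&frame, &pkts[i]);
        if (!frame.complete)
            continue;                /* step 30: frame not complete, read more */
        frames++;                    /* step 40: do something with the frame */
        frame.bytes = 0;             /* start collecting the next frame */
        frame.complete = 0;
    }
    return frames;
}
```

The real example in the next section has the same shape: av_read_frame supplies the packets, avcodec_decode_video plays the role of toy_decode, and frameFinished is the completeness flag.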

3. Example code
<An ffmpeg and SDL Tutorial 1> gives a bare-bones decoder. Let's look closely at the story behind it. To make the discussion easier, here is the code first:

#include <ffmpeg/avcodec.h>
#include <ffmpeg/avformat.h>

#include <stdio.h>

void SaveFrame(AVFrame *pFrame, int width, int height, int iFrame) {
  FILE *pFile;
  char szFilename[32];
  int y;

  // Open file
  sprintf(szFilename, "frame%d.ppm", iFrame);
  pFile=fopen(szFilename, "wb");
  if(pFile==NULL)
    return;

  // Write header
  fprintf(pFile, "P6\n%d %d\n255\n", width, height);

  // Write pixel data
  for(y=0; y<height; y++)
    fwrite(pFrame->data[0]+y*pFrame->linesize[0], 1, width*3, pFile);

  // Close file
  fclose(pFile);
}

int main(int argc, char *argv[]) {
  AVFormatContext *pFormatCtx;
  int i, videoStream;
  AVCodecContext *pCodecCtx;
  AVCodec *pCodec;
  AVFrame *pFrame;
  AVFrame *pFrameRGB;
  AVPacket packet;
  int frameFinished;
  int numBytes;
  uint8_t *buffer;

  if(argc < 2) {
    printf("Please provide a movie file\n");
    return -1;
  }
  // Register all formats and codecs
  // ######## [1] ########
  av_register_all();

  // Open video file
  // ######## [2] ########
  if(av_open_input_file(&pFormatCtx, argv[1], NULL, 0, NULL)!=0)
    return -1; // Couldn't open file

  // Retrieve stream information
  // ######## [3] ########
  if(av_find_stream_info(pFormatCtx)<0)
    return -1; // Couldn't find stream information

  // Dump information about file onto standard error
  dump_format(pFormatCtx, 0, argv[1], 0);

  // Find the first video stream
  videoStream=-1;
  for(i=0; i<pFormatCtx->nb_streams; i++)
    if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_VIDEO) {
      videoStream=i;
      break;
    }
  if(videoStream==-1)
    return -1; // Didn't find a video stream

  // Get a pointer to the codec context for the video stream
  pCodecCtx=pFormatCtx->streams[videoStream]->codec;

  // Find the decoder for the video stream
  pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
  if(pCodec==NULL) {
    fprintf(stderr, "Unsupported codec!\n");
    return -1; // Codec not found
  }
  // Open codec
  if(avcodec_open(pCodecCtx, pCodec)<0)
    return -1; // Could not open codec

  // Allocate video frame
  pFrame=avcodec_alloc_frame();

  // Allocate an AVFrame structure
  pFrameRGB=avcodec_alloc_frame();
  if(pFrameRGB==NULL)
    return -1;

  // Determine required buffer size and allocate buffer
  numBytes=avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width,
                              pCodecCtx->height);
  buffer=(uint8_t *)av_malloc(numBytes*sizeof(uint8_t));

  // Assign appropriate parts of buffer to image planes in pFrameRGB
  // Note that pFrameRGB is an AVFrame, but AVFrame is a superset
  // of AVPicture
  avpicture_fill((AVPicture *)pFrameRGB, buffer, PIX_FMT_RGB24,
                 pCodecCtx->width, pCodecCtx->height);

  // Read frames and save first five frames to disk
  // ######## [4] ########
  i=0;
  while(av_read_frame(pFormatCtx, &packet)>=0) {
    // Is this a packet from the video stream?
    if(packet.stream_index==videoStream) {
      // Decode video frame
      avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
                           packet.data, packet.size);

      // Did we get a video frame?
      if(frameFinished) {
        // Convert the image from its native format to RGB
        img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24,
                    (AVPicture*)pFrame, pCodecCtx->pix_fmt,
                    pCodecCtx->width,
                    pCodecCtx->height);

        // Save the frame to disk
        if(++i<=5)
          SaveFrame(pFrameRGB, pCodecCtx->width, pCodecCtx->height,
                    i);
      }
    }

    // Free the packet that was allocated by av_read_frame
    av_free_packet(&packet);
  }

  // Free the RGB image
  av_free(buffer);
  av_free(pFrameRGB);

  // Free the YUV frame
  av_free(pFrame);

  // Close the codec
  avcodec_close(pCodecCtx);

  // Close the video file
  av_close_input_file(pFormatCtx);

  return 0;
}

The code is clearly commented and needs little further explanation. If formats such as YUV420, RGB, and PPM are unfamiliar, please look them up, or see the related articles at http://barrypopy.cublog.cn/
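
For readers who have not met PPM before, here is a standalone sketch of the format that SaveFrame produces: a short text header ("P6", width and height, maximum value 255) followed by raw RGB bytes, three per pixel. The helpers ppm_format_header and write_ppm are hypothetical names for this illustration and use no FFmpeg types; stride plays the role of pFrame->linesize[0].

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format the binary PPM (P6) header into buf.
 * Returns the header length in bytes, or -1 if buf is too small. */
static int ppm_format_header(char *buf, size_t cap, int width, int height) {
    int n = snprintf(buf, cap, "P6\n%d %d\n255\n", width, height);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Write a packed RGB24 image as PPM. stride is the number of bytes per row
 * in rgb and may be larger than width*3. Returns 0 on success, -1 on error. */
static int write_ppm(FILE *out, const uint8_t *rgb, int width, int height, int stride) {
    char header[64];
    int n;
    assert(out != NULL && rgb != NULL);
    if (width <= 0 || height <= 0 || stride < width * 3)
        return -1;
    n = ppm_format_header(header, sizeof header, width, height);
    if (n < 0 || fwrite(header, 1, (size_t)n, out) != (size_t)n)
        return -1;
    for (int y = 0; y < height; y++) {
        /* Copy only the visible width*3 bytes of each row, skipping padding. */
        const uint8_t *row = rgb + (size_t)y * (size_t)stride;
        if (fwrite(row, 1, (size_t)width * 3, out) != (size_t)width * 3)
            return -1;
    }
    return 0;
}
```

Writing row by row matters because a row in memory may be padded, which is why SaveFrame steps through the image with linesize[0] instead of width*3.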

This code is a good demo of how to implement frame grabbing, but we should look at what the magician is doing backstage rather than just enjoy the show.

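4. The story behind it
The real difficulty lies in [1], [2], [3], and [4] above. Everything else is conversion between data structures, and is not hard to follow if you read the code carefully.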

[1]: Not much to say. If it is unclear, see the article on the FFmpeg framework that I reposted.

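[2]: First, the AVFormatContext *pFormatCtx structure. Read literally, AVFormatContext is the context of an AVFormat (the container format described above), so it is naturally the master structure that holds the container's information. As you will see later, almost everything else can be reached starting from it.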

Let's see what av_open_input_file() does:
[libavformat/utils.c]
int av_open_input_file(AVFormatContext **ic_ptr, const char *filename,
                       AVInputFormat *fmt,
                       int buf_size,
                       AVFormatParameters *ap)
{
    ......
    if (!fmt) {
        /* guess format if no file can be opened */
        fmt = av_probe_input_format(pd, 0);
    }

    ......
    err = av_open_input_stream(ic_ptr, pb, filename, fmt, ap);
    ......
}

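So it does just two things:
1). Probe the container file format
2). Get the stream information from the container file

Together, these two steps amount to calling the file format's demuxer to separate the streams.

The detailed flow is: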

av_open_input_file
|
+---->av_probe_input_format walks all registered demuxers, starting from
| first_iformat, and calls each one's probe function
|
+---->av_open_input_stream calls the chosen demuxer's read_header function
(ic->iformat->read_header) to get the stream information
If you now go back to the FFmpeg framework article I reposted, does it make more sense? :)

[3]: This simply retrieves the stream information through AVFormatContext. Nothing more to say.

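[4]: A little FFmpeg background first. In theory, a packet may contain only part of a frame's data. To keep its implementation simple, though, FFmpeg makes sure that for video each packet contains at least one
complete frame, and audio is handled in a corresponding way. This is an implementation decision, not a protocol requirement.
So what the code above actually does is:
read a packet from the file;
decode the corresponding frame from the packet;
if (frame decoding is finished)
do something();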

Let's see how a packet is obtained and how a frame is decoded from it.

av_read_frame
|
+---->av_read_frame_internal
|
+---->av_parser_parse calls the selected parser's s->parser->parser_parse function to reassemble a frame from the raw packet data

avcodec_decode_video
|
+---->avctx->codec->decode calls the decode function of the selected codec

So, as the flow above shows, there are really two parts:

the first is demuxing (demuxer), and the second is decoding (decode).

The functions used are:
av_open_input_file() ---->demuxing

av_read_frame() |
| ---->decoding
avcodec_decode_video() |

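5. What comes next
Together with the reposted article on the FFmpeg framework, this should be enough to trace the decoding flow from end to end. What remains is the analysis of specific container formats and specific codecs, which we will continue with later.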

References:
[1]. <An ffmpeg and SDL Tutorial>
http://www.dranger.com/ffmpeg/tutorial01.html

[2]. <FFMpeg框架代码阅读> (Reading the FFmpeg Framework Code)
http://blog.csdn.net/wstarx/archive/2007/04/20/1572393.aspx

2010-10-09 17:05