
HTTP vs FTP


 

Reposted from: http://daniel.haxx.se/docs/ftp-vs-http.html

 

FTP vs HTTP

This is my attempt to document the primary differences between FTP and HTTP, since this is a commonly asked question and a lot of misconceptions (and outright lies) are flying around. If you find any errors, or have anything to add, please mail me!

Both protocols are used for uploads and downloads on the internet, for text and for binary, both over TCP/IP. But there are a lot of differences in the details:

 

Age

FTP (RFC 959) appeared roughly ten years before HTTP was invented. FTP was the one and only protocol back then. The initial traces of what became RFC 959 can be found as early as 1971 (RFC 114).

 

Upload

Both protocols offer uploads. FTP has an "append" command, whereas HTTP takes more of a "here's data coming, now you deal with it" approach.

It is worth noting that WebDAV is a protocol on top of HTTP that provides "filesystem-like" abilities.

 

ASCII/binary

FTP has a notion of file format, so it can transfer data as ASCII or binary (and more), whereas HTTP always sends things as binary. FTP thus also allows text conversions when ASCII files are sent between systems of different sorts.

HTTP provides meta-data with files in the form of the Content-Type header, which clients use to interpret the contents accordingly. FTP has no such thing.

Headers

Transfers with HTTP always include a set of headers that carry meta-data. FTP does not send such headers. When sending small files, the headers can be a significant part of the amount of actual data transferred. HTTP headers contain info about things such as last modified date, character encoding, server name and version, and more.
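
To get a feel for how much the headers weigh on a small file, here is a minimal sketch that measures the share of a response taken up by the status line and headers. The header values and the function name are made up for illustration; real servers send different sets of headers.

```python
# Hypothetical response headers, chosen only to illustrate typical sizes.
EXAMPLE_HEADERS = {
    "Date": "Sun, 29 Nov 2009 18:55:00 GMT",
    "Server": "ExampleServer/1.0",
    "Last-Modified": "Sun, 29 Nov 2009 18:00:00 GMT",
    "Content-Type": "text/plain; charset=utf-8",
}


def header_overhead(body: bytes, extra_headers: dict) -> float:
    """Return the fraction of an HTTP response spent on status line and headers."""
    headers = dict(extra_headers)
    headers["Content-Length"] = str(len(body))
    head = "HTTP/1.1 200 OK\r\n"
    head += "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    head += "\r\n"  # the blank line that separates headers from the body
    head_size = len(head.encode("latin-1"))
    return head_size / (head_size + len(body))


small_share = header_overhead(b"x" * 100, EXAMPLE_HEADERS)
large_share = header_overhead(b"x" * 1_000_000, EXAMPLE_HEADERS)
print(f"100 byte body: {small_share:.1%} of the response is headers")
print(f"1 MB body:     {large_share:.3%} of the response is headers")
```

With these assumed headers, the headers outweigh a 100 byte body, while they vanish into the noise for a megabyte-sized one.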

Pipelining

HTTP supports pipelining. It means that a client can ask for the next transfer already before the previous one has ended, which thus allows multiple documents to get sent without a round-trip delay between the documents, and TCP packets are thus optimized for transfer speed.
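
As a rough sketch of what pipelining looks like on the wire: the client writes several requests back to back and only then starts reading the responses, which arrive in the same order. The host name, paths, and function name below are placeholders, and real clients also need to handle servers that don't support pipelining.

```python
def build_pipelined_requests(host: str, paths: list) -> bytes:
    """Concatenate several HTTP/1.1 GET requests into one buffer.

    Sending this buffer in one go means the second request is already
    on its way before the first response has arrived.
    """
    requests = []
    for path in paths:
        requests.append(f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n")
    return "".join(requests).encode("ascii")


# Placeholder host and paths, for illustration only.
payload = build_pipelined_requests("example.com", ["/a.html", "/b.html"])
print(payload.decode("ascii"))
```

A client would pass the payload to a single `sendall()` call on a connected socket and then parse two responses from the same stream.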

Something related, although not similar, is FTP's support for requesting multiple files to get transferred in parallel using the same control connection. That's of course using new TCP connections for each transfer so it'll get different performance metrics.

FTP Command/Response

FTP involves the client sending commands to which the server responds. A single transfer can involve quite a series of commands. This of course has a negative impact since there's a round-trip delay for each command. HTTP transfers are primarily just one request and one response (for each document). Retrieving a single FTP file can easily get up to 10 round-trips.
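
The round-trip cost can be illustrated with a simple count. The command sequence below is one plausible sequence for an anonymous download, not something every client sends, and the file names and helper names are made up. Connection setup and the server greeting are left out for both protocols.

```python
# One plausible FTP command sequence for a single download (an assumption;
# actual clients differ in which commands they send and in what order).
FTP_DOWNLOAD_COMMANDS = [
    "USER anonymous",
    "PASS guest@example.com",
    "PWD",
    "CWD /pub",
    "TYPE I",
    "SIZE file.bin",
    "PASV",
    "RETR file.bin",
    "QUIT",
]

# The HTTP equivalent is a single request.
HTTP_DOWNLOAD_REQUESTS = ["GET /pub/file.bin HTTP/1.1"]


def round_trips(commands: list) -> int:
    """Each command waits for its response, so it costs one round-trip."""
    return len(commands)


def latency_cost(commands: list, rtt_seconds: float) -> float:
    """Time spent just waiting for responses, ignoring the transfer itself."""
    return round_trips(commands) * rtt_seconds


rtt = 0.1  # assumed round-trip time of 100 ms
print("FTP :", round_trips(FTP_DOWNLOAD_COMMANDS), "round-trips,",
      round(latency_cost(FTP_DOWNLOAD_COMMANDS, rtt), 2), "s waiting")
print("HTTP:", round_trips(HTTP_DOWNLOAD_REQUESTS), "round-trip,",
      round(latency_cost(HTTP_DOWNLOAD_REQUESTS, rtt), 2), "s waiting")
```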

Two Connections

One of the biggest hurdles with FTP in real life is its use of two connections. It uses a primary connection to send control commands on, and when it sends or receives data, it opens a second TCP stream for that purpose.

Active and Passive

FTP opens the second connection in either active or passive mode, which basically says which end initiates it. It's a client decision to try either way.
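
In passive mode the client sends PASV and the server replies with an address to connect to. In active mode the client sends PORT with its own address and the server connects back. Both encode the address as six comma-separated numbers: four for the IPv4 address and two for the port. Here is a small sketch of that encoding. The function names are mine, and the parser assumes the common parenthesized reply form.

```python
import re


def parse_pasv_reply(reply: str) -> tuple:
    """Extract (host, port) from a '227 ... (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a recognizable PASV reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte first, then low byte
    return host, port


def format_port_command(host: str, port: int) -> str:
    """Build the PORT command a client sends in active mode (IPv4 only)."""
    fields = host.split(".") + [str(port // 256), str(port % 256)]
    return "PORT " + ",".join(fields)


print(parse_pasv_reply("227 Entering Passive Mode (192,168,1,2,19,137)"))
print(format_port_command("192.168.1.2", 5001))
```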

Firewalls and NATs

FTP's use of two connections, where the second one uses dynamic port numbers and can go in either direction, gives firewall admins grief, and firewalls really have to "understand" FTP at the application protocol layer to work well.

This also means that if both parties are behind NATs, you cannot use FTP!

Encrypted Control Connections

Since firewalls need to understand FTP to be able to open ports for the secondary connection etc, there's a huge problem with encrypted FTP (FTP-SSL or FTPS), since the control connection is then sent encrypted and the firewall(s) cannot interpret the commands that deal with creating the second connection. Also, the FTPS standard took a very long time to "hit it", so there exists a range of hybrid versions out in the wild.

Authentications

FTP and HTTP have different sets of authentication methods documented. While both protocols offer basically plain-text user and password by default, there are several commonly used authentication methods for HTTP that don't send the password as plain text, but there aren't as many (non-Kerberos) options available for FTP.

Download

Both protocols offer support for download. Both protocols used to have problems with file sizes larger than 2GB but those are history for modern clients and servers on modern operating systems.

Ranges/resume

Both FTP and HTTP support resumed transfers in both directions, but HTTP supports more advanced byte ranges.
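
To make the difference concrete: an FTP client resumes a download by sending REST with a byte offset before RETR, while an HTTP client sends a Range header, which can also ask for a closed range or several ranges at once. The helper names below are made up for this sketch.

```python
def ftp_resume_commands(path: str, offset: int) -> list:
    """FTP can only say 'restart from this offset' and then fetch the rest."""
    return [f"REST {offset}", f"RETR {path}"]


def http_resume_header(offset: int) -> dict:
    """The HTTP equivalent: an open-ended byte range from the offset onward."""
    return {"Range": f"bytes={offset}-"}


def http_multi_range_header(ranges: list) -> dict:
    """HTTP can also request several inclusive (first, last) byte ranges."""
    spec = ",".join(f"{first}-{last}" for first, last in ranges)
    return {"Range": f"bytes={spec}"}


print(ftp_resume_commands("file.bin", 4096))
print(http_resume_header(4096))
print(http_multi_range_header([(0, 99), (500, 599)]))
```

In Python's standard library, `ftplib.FTP.retrbinary()` accepts a `rest` argument that sends the REST command for you.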

Persistent Connections

For HTTP communication, a client can maintain a single connection to a server and just keep using that for any number of transfers. FTP must create a new data connection for each new data transfer.

Chunked Encoding

HTTP has a means of sending data whose size is unknown at the start of the transfer without having to close the connection to signal the end of the data. This is done with "chunked encoding", which simply sends a stream of [size-of-data][data] blocks over the wire. Another obvious benefit of chunked encoding compared to plain closing of the connection is the ability to detect premature connection shutdowns.
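
Here is a minimal sketch of chunked encoding as used by HTTP/1.1: each block is its size in hex, CRLF, the data, CRLF, and a zero-sized chunk marks the end. The function names are mine. The decoder ignores chunk extensions and does not parse trailers. Note how a stream that is cut short is detected instead of being silently accepted as complete.

```python
def encode_chunked(blocks: list) -> bytes:
    """Encode an iterable of byte blocks as an HTTP/1.1 chunked body."""
    out = bytearray()
    for block in blocks:
        if block:  # an empty block would look like the terminating chunk
            out += f"{len(block):x}\r\n".encode("ascii") + block + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, with no trailers
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Decode a chunked body; raise ValueError if it ends prematurely."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.find(b"\r\n", pos)
        if line_end == -1:
            raise ValueError("truncated stream: missing chunk size line")
        size_field = data[pos:line_end].split(b";", 1)[0]  # drop extensions
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            return bytes(body)  # terminating chunk seen: transfer complete
        chunk = data[pos:pos + size]
        if len(chunk) < size or data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("truncated stream: chunk shorter than announced")
        body += chunk
        pos += size + 2


encoded = encode_chunked([b"hello", b" world"])
print(encoded)
print(decode_chunked(encoded))
```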

Compression

HTTP provides a way for the client and server to negotiate and use (different) compression for the sent data, with gzip being perhaps the most compact one.

FTP offers an official "built-in" run length encoding that compresses the amount of data to send, but not by a great deal on ordinary binary data. Compression has also traditionally been done for FTP using various "hackish" approaches that were never in any FTP spec.
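
The HTTP side of this can be sketched as follows: the client lists the encodings it accepts in Accept-Encoding, and the server picks one and labels the response with Content-Encoding. This is a simplified sketch with a made-up function name; it ignores the q-values a real server would have to honor.

```python
import gzip


def maybe_compress(body: bytes, accept_encoding: str) -> tuple:
    """Return (payload, extra_headers), gzip-compressing if the client allows it."""
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if "gzip" in offered:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}  # send the body as-is


text = b"hello world\n" * 1000
payload, headers = maybe_compress(text, "gzip, deflate")
print(len(text), "bytes before,", len(payload), "bytes after", headers)
```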

FXP

FTP supports "third party transfers", often called "FXP". It allows a client to ask a server to send data to a third host, a host that isn't the same as the client. This is often disabled in modern FTP servers though due to the security implications.

IPv6

HTTP and FTP both support IPv6 fine, but the original FTP spec had no such support, and still today many FTP servers don't support the commands needed to enable it. The same goes for the firewalls in between that need to understand FTP.
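
The commands in question are EPSV and EPRT from RFC 2428, which replace the IPv4-only PASV and PORT. A short sketch of their formats, with function names of my own choosing:

```python
import ipaddress
import re


def parse_epsv_reply(reply: str) -> int:
    """Extract the port from a '229 ... (|||port|)' reply.

    Only the port is sent; the client reuses the address of the
    control connection, which is what makes this work for IPv6.
    """
    match = re.search(r"\((.)\1\1(\d+)\1\)", reply)
    if match is None:
        raise ValueError(f"not a recognizable EPSV reply: {reply!r}")
    return int(match.group(2))


def format_eprt_command(host: str, port: int) -> str:
    """Build an EPRT command; protocol 1 means IPv4 and 2 means IPv6."""
    protocol = 1 if ipaddress.ip_address(host).version == 4 else 2
    return f"EPRT |{protocol}|{host}|{port}|"


print(parse_epsv_reply("229 Entering Extended Passive Mode (|||6446|)"))
print(format_eprt_command("1080::8:800:200C:417A", 5282))
print(format_eprt_command("132.235.1.2", 6275))
```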

Name based virtual hosting

Using HTTP, you can easily host many sites on the same server and they are all differentiated by their names. In FTP, you cannot do name based virtual hosting at all.
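
What makes this work in HTTP is the Host header that the client sends with each request. A server can use it to pick the right site, roughly like this sketch. The site table, paths, and function name are hypothetical, and the port stripping shown here would not handle IPv6 literals.

```python
# Hypothetical mapping from site name to document root.
SITES = {
    "www.example.com": "/srv/www/example",
    "blog.example.com": "/srv/www/blog",
}


def select_site(raw_request: bytes) -> str:
    """Pick a document root based on the request's Host header."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower().split(":")[0]  # drop any port
            return SITES[host]
    raise ValueError("request has no Host header")


print(select_site(b"GET / HTTP/1.1\r\nHost: blog.example.com\r\n\r\n"))
```

Both names can resolve to the same IP address, and the server still tells them apart. The FTP commands in RFC 959 carry no equivalent piece of information.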

Dir Listing

One area in which FTP stands out somewhat is that it is a protocol that operates directly at the file level. This means that FTP has, for example, commands for listing directory contents on the remote server, while HTTP has no such concept.

However, the FTP spec authors lived in a different age, so the commands for listing directory contents (LIST and NLST) don't have a specified output format, which makes it a pain to write programs to parse the output. Later specs (RFC 3659) have addressed this with new commands, but they are not widely implemented or supported by either servers or clients.
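
The new commands from RFC 3659 (MLST and MLSD) return one entry per line as a list of "fact=value;" pairs, followed by a space and the file name. That format is easy to parse, as this sketch shows. The function name and sample line are my own.

```python
def parse_mlsd_line(line: str) -> tuple:
    """Split an MLSD entry into (name, facts).

    Fact names are case-insensitive, so they are lowercased here.
    Everything after the first space is the file name, which may
    itself contain spaces.
    """
    facts_part, _, name = line.partition(" ")
    facts = {}
    for fact in facts_part.split(";"):
        if fact:
            key, _, value = fact.partition("=")
            facts[key.lower()] = value
    return name, facts


name, facts = parse_mlsd_line(
    "Type=file;Size=1024;Modify=20091129195500; my report.txt"
)
print(name, facts)
```

Python's `ftplib.FTP.mlsd()` does this for servers that support the command. For LIST output there is no such luck, since its format varies from server to server.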

Proxy Support

One of the biggest selling points for HTTP over FTP is its support for proxies, built into the protocol from day 1. The support is so successful and well used that lots of other protocols can be sent over HTTP these days just for its ability to go through proxies.

FTP has always been used over proxies as well, but that was never standardized and was always done in lots of different ad-hoc approaches.

Transfer Speed

Possibly the most common question: which is faster for transfers?

Given all the details above, here is what makes FTP faster:

  • No added meta-data in the sent files, just the raw binary
  • Never chunked encoding "overhead"

What makes HTTP faster:

  • reusing existing persistent connections makes for better TCP performance
  • pipelining makes asking for multiple files from the same server faster
  • (automatic) compression makes less data get sent
  • no command/response flow minimizes extra round-trips

Ultimately the net outcome of course differs depending on specific details, but I would say that for single-shot static files, you won't be able to measure a difference. For a single-shot small file, you might get it faster with FTP (unless the server is at a long round-trip distance). When getting multiple files, HTTP should be the faster one.

Further

There are further differences, like the HTTP ability to do conditional requests, negotiate content language and much more but those are not big enough to be specified in this document.

Thanks

Feedback and improvements by: Micah Cowan, Joe Touch, Austin Appel

Updated: November 29, 2009 19:55 (Central European, Stockholm Sweden)

 

daniel AT haxx DOT se

 
