
Report on Concurrency Programming

Posted: 2007-04-29

I. How I came across Erlang
1. RFID middleware




2. Jabber (XML streams; http://zh.wikipedia.org/wiki/Jabber)

3. ejabberd (http://www.process-one.net/en/ )


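II. Problems facing today's commercial environment (web servers)
1. The number of connections keeps climbing

2. Connections stay open for a long time
Traditionally, httpd used prefork to handle bursts of many short-lived connections. That approach is seriously challenged in today's environment: architectures that need long-lived connections, such as HTTP streaming, server push, and Comet, reduce the number of connections httpd can serve. The biggest problem with forking a process per connection is the large amount of memory each process occupies, which is why other server architectures have risen (lighttpd, ghttpd, ...).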

The C10K problem( http://www.kegel.com/c10k.html )
It's time for web servers to handle ten thousand clients simultaneously, don't you think? After all, the web is a big place now.
And computers are big, too. You can buy a 1000MHz machine with 2 gigabytes of RAM and an 1000Mbit/sec Ethernet card for $1200 or so. Let's see - at 20000 clients, that's 50KHz, 100Kbytes, and 50Kbits/sec per client. It shouldn't take any more horsepower than that to take four kilobytes from the disk and send them to the network once a second for each of twenty thousand clients. (That works out to $0.08 per client, by the way. Those $100/client licensing fees some operating systems charge are starting to look a little heavy!) So hardware is no longer the bottleneck???


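III. Concurrency Programming
1. fork
The original program:
(program + data) --fork (makes a copy)--> (program + data)

After the program forks, the child inherits a copy of the parent's data, and from then on the two are independent. To pass information between them, you need a pipe, shared memory, or a Unix socket.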

2. thread
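In fact, on Linux a thread is just a lightweight process: the Linux kernel creates a new process with the fork() system call, whereas a thread is created with the clone() system call. The only difference between fork() and clone() is that clone() lets you specify which resources are shared with the parent process; when all resources are shared with the parent, the result is equivalent to a thread. Because threads share resources with the parent, they very easily lead to problems such as deadlocks and race conditions.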

3. lightweight Threads ( http://www.defmacro.org/ramblings/concurrency.html)
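An Erlang process is a lightweight thread, so it is very cheap to create and destroy, and switching between processes is fast. Under the hood it is little more than a simple function, so the heavy context-switching cost of OS processes is avoided and the switching happens only between functions (Erlang threads are VM-level threads).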
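To make "cheap to create" a bit more concrete, here is a minimal sketch, not taken from the original report. The module name spawn_many and the function run/1 are made up for illustration. It spawns N processes, has each one send a message back to the parent, and counts the replies.

```erlang
-module(spawn_many).
-export([run/1]).

%% Spawn N processes; each one sends its index back to the parent and exits.
%% Returns the number of replies collected, which should equal N.
run(N) when is_integer(N), N >= 0 ->
    Parent = self(),
    Ref = make_ref(),
    lists:foreach(
      fun(I) -> spawn(fun() -> Parent ! {Ref, I} end) end,
      lists:seq(1, N)),
    collect(Ref, N, 0).

%% Wait for one tagged reply per spawned process.
collect(_Ref, 0, Count) ->
    Count;
collect(Ref, Remaining, Count) ->
    receive
        {Ref, _I} -> collect(Ref, Remaining - 1, Count + 1)
    end.
```

Calling spawn_many:run(10000) in the shell should return 10000. Wrapping the call in timer:tc/3 gives a rough feel for the cost on a given machine.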

This document gives a brief overview of Erlang:
http://mirror.linux.org.au/pub/linux.conf.au/2007/video/talks/252.pdf


IV. Erlang ( http://www.erlang.org/ )
1. Erlang's own summary, from its "About Erlang" page

Erlang is a programming language which has many features more commonly associated with an operating system than with a programming language: concurrent processes, scheduling, memory management, distribution, networking, etc.
The initial open-source Erlang release contains the implementation of Erlang, as well as a large part of Ericsson's middleware for building distributed high-availability systems.
Erlang is characterized by the following features:
Concurrency - Erlang has extremely lightweight processes whose memory requirements can vary dynamically. Processes have no shared memory and communicate by asynchronous message passing. Erlang supports applications with very large numbers of concurrent processes. No requirements for concurrency are placed on the host operating system.
Distribution - Erlang is designed to be run in a distributed environment. An Erlang virtual machine is called an Erlang node. A distributed Erlang system is a network of Erlang nodes (typically one per processor). An Erlang node can create parallel processes running on other nodes, which perhaps use other operating systems. Processes residing on different nodes communicate in exactly the same way as processes residing on the same node.
Soft real-time - Erlang supports programming "soft" real-time systems, which require response times in the order of milliseconds. Long garbage collection delays in such systems are unacceptable, so Erlang uses incremental garbage collection techniques.
Hot code upgrade - Many systems cannot be stopped for software maintenance. Erlang allows program code to be changed in a running system. Old code can be phased out and replaced by new code. During the transition, both old code and new code can coexist. It is thus possible to install bug fixes and upgrades in a running system without disturbing its operation.
Incremental code loading - Users can control in detail how code is loaded. In embedded systems, all code is usually loaded at boot time. In development systems, code is loaded when it is needed, even when the system is running. If testing uncovers bugs, only the buggy code need be replaced.
External interfaces - Erlang processes communicate with the outside world using the same message passing mechanism as used between Erlang processes. This mechanism is used for communication with the host operating system and for interaction with programs written in other languages. If required for reasons of efficiency, a special version of this concept allows e.g. C programs to be directly linked into the Erlang runtime system.

2. An overview of the Erlang language
Book: ( http://pragmaticprogrammer.com/titles/jaerlang/index.html )

[ Sequential Erlang ]

Exam1:

Consider the factorial function N! defined by:
N! = N * (N-1)! when N > 0
N! = 1 when N = 0

-module(math1).
-export([fac/1]).

fac(N) when N > 0 -> N * fac(N-1);
fac(0)-> 1.

Exam2:

-module(math2).
-export([sum1/1, sum2/1]).

sum1([H | T]) -> H + sum1(T);
sum1([]) -> 0.

sum2(L) -> sum2(L, 0).
sum2([], N) -> N;
sum2([H | T], N) -> sum2(T, H+N).

[ Concurrency Programming ]

Exam3:

-module(concurrency).

-export([start/0, say/2]).

%% Print What the given number of times, then return done.
say(_What, 0) ->
    done;
say(What, Times) ->
    io:format("~p~n", [What]),
    say(What, Times - 1).

%% Spawn two processes that print concurrently.
start() ->
    spawn(concurrency, say, [hello, 3]),
    spawn(concurrency, say, [goodbye, 3]).

Exam4:

-module(area_server).
-export([loop/0]).

loop() ->
    receive
        {rectangle, Width, Ht} ->
            io:format("Area of rectangle is ~p~n",[Width * Ht]),
            loop();
        {circle, R} ->
            io:format("Area of circle is ~p~n", [3.14159 * R * R]),
            loop();
        Other ->
            io:format("I don't know what the area of a ~p is ~n",[Other]),
            loop()
    end.

We can create a process which evaluates loop/0 in the shell:

Pid = spawn(area_server,loop,[]).
Pid ! {rectangle, 6, 10}.
Pid ! {circle, 23}.
Pid ! {triangle,2,4,5}.


4. Erlang-style processes or an event-based model for actors ( http://lambda-the-ultimate.org/node/1615 )
( http://lamp.epfl.ch/~phaller/doc/haller07coord.pdf )


Message passing
Each process has its own input queue for messages it receives. New messages received are put at the end of the queue. When a process executes a receive, the first message in the queue is matched against the first pattern in the receive. If this matches, the message is removed from the queue and the actions corresponding to the pattern are executed.
However, if the first pattern does not match, the second pattern is tested. If this matches, the message is removed from the queue and the actions corresponding to the second pattern are executed. If the second pattern does not match, the third is tried and so on until there are no more patterns to test. If there are no more patterns to test, the first message is kept in the queue and we try the second message instead. If this matches any pattern, the appropriate actions are executed and the second message is removed from the queue (keeping the first message and any other messages in the queue). If the second message does not match we try the third message and so on until we reach the end of the queue. If we reach the end of the queue, the process blocks (stops execution) and waits until a new message is received and this procedure is repeated.
Of course the Erlang implementation is "clever" and minimizes the number of times each message is tested against the patterns in each receive.
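
The matching order described above can be tried out with a small sketch. The module name selective and the function demo/0 are made up for illustration. The process sends itself two messages and then receives them in the opposite order, so the first message stays in the queue while the second is taken out.

```erlang
-module(selective).
-export([demo/0]).

%% Put two messages in our own mailbox, then receive them out of arrival order.
demo() ->
    Ref = make_ref(),
    self() ! {Ref, first},
    self() ! {Ref, second},
    %% Only {Ref, second} matches here, so {Ref, first} stays queued.
    A = receive {Ref, second} -> second end,
    %% The skipped message is still at the head of the queue.
    B = receive {Ref, first} -> first end,
    [A, B].
```

selective:demo() returns [second, first]. The unique reference keeps the example from picking up unrelated messages that may already be in the shell's mailbox.
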
V. Erlang resources
Website:
Open Source Erlang
http://www.erlang.org
http://www.process-one.net/en/projects/

Mailing list:
Erlang-questions -- Erlang/OTP discussions
http://www.erlang.org/mailman/listinfo/erlang-questions

Books:
Concurrent Programming in Erlang
http://www.erlang.org/download/erlang-book-part1.pdf
Programming Erlang: Software for a Concurrent World
http://pragmaticprogrammer.com/titles/jaerlang/index.html

 

 

MY BLOG: http://rd-program.blogspot.com

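Posted: 2007-05-21
I rewrote one of our company's file upload servers in Erlang, and the measured performance was disappointing. The C++ version, uploading 100 files of 200 KB each from a single thread, took 1.2 s; the Erlang version took 7 s. With 20 concurrent processes, the C++ version varied widely, between 1.3 and 3.0 s, while the Erlang version was fairly uniform at 13 s each. I rebuilt Erlang with the hipe/smp/threads options enabled, and on retesting the performance actually dropped. The test machine has 4 CPUs and is 32-bit. Are there any good suggestions for improving I/O performance? I receive data and write it straight to a file; the calls involved are gen_tcp:recv, file:open, file:write, and file:close.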
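Posted: 2007-05-21
Are you writing the files concurrently? In my experience, concurrent writes perform poorly for disk I/O. Having many processes read and a single process write may work better.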
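Posted: 2007-05-21
Yes, the writes are concurrent, and the C++ version currently writes concurrently too. That has some effect on performance, but receiving network data matters more. The current tests run on the local machine, so the network effect is not very visible.

By "one process writes, many read", you mean Erlang processes, right? Each one forwards the data it receives to the writer process? Does that mean a file has to be fully received before it can be written? Otherwise, wouldn't every received chunk have to be sent to the writer process together with the file name or file handle? A single file may take many receives to complete.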
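
One possible reading of the single-writer idea, as a hedged sketch only: each receiving process tags every chunk with the file path, and one writer process owns all the file handles, so a file does not have to be fully received before writing starts. The module name single_writer and its functions are made up for illustration, and the map syntax needs a much newer Erlang/OTP release than the ones available when this thread was written.

```erlang
-module(single_writer).
-export([start/0, write/3, close/2, stop/1]).

%% Start the writer process. It owns every open file handle.
start() ->
    spawn(fun() -> loop(#{}) end).

%% Called by a receiving process for each chunk it reads from its socket.
write(Writer, Path, Chunk) ->
    Writer ! {chunk, Path, Chunk},
    ok.

%% Called after the last chunk of Path; returns once the file is closed.
close(Writer, Path) ->
    Writer ! {close, self(), Path},
    receive {closed, Path} -> ok end.

stop(Writer) ->
    Writer ! stop,
    ok.

%% Files maps each path to its open file handle.
loop(Files) ->
    receive
        {chunk, Path, Chunk} ->
            {Fd, Files1} = handle_for(Path, Files),
            ok = file:write(Fd, Chunk),
            loop(Files1);
        {close, From, Path} ->
            Files1 = case maps:take(Path, Files) of
                         {Fd, Rest} -> ok = file:close(Fd), Rest;
                         error -> Files
                     end,
            From ! {closed, Path},
            loop(Files1);
        stop ->
            lists:foreach(fun file:close/1, maps:values(Files))
    end.

%% Open the file on the first chunk, reuse the handle afterwards.
handle_for(Path, Files) ->
    case maps:find(Path, Files) of
        {ok, Fd} -> {Fd, Files};
        error ->
            {ok, Fd} = file:open(Path, [write, binary]),
            {Fd, maps:put(Path, Fd, Files)}
    end.
```

Messages from one sender to one receiver arrive in the order they were sent, so the chunks of a file are written in order as long as a single receiving process handles that file. Whether this is actually faster than concurrent writes would still have to be measured.
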
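Posted: 2007-05-21
Oh, I see what you mean.
From your description, both reading and writing are synchronous operations. Could the gap between reading and then writing in the same process be what causes the long delay? Could you split them into separate processes?
I have written some network programs in Erlang, and my impression is that the performance is not that far behind C.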
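Posted: 2007-05-21
Oops, I misremembered.

In the test above, the C++ version took 0.13 s and the Erlang version took 0.7 s, each for 100 files of 200 KB.

I just noticed that after building the HiPE version I had not recompiled with erlc, so the test results were still the old ones. After recompiling and retesting, it took only 0.17 s, which is no longer far from the C++ version.

With 10 client processes, the C++ version averages 0.4 s and Erlang comes in at 2.2 to 2.4 s. The gap widens, but this basically meets the needs of the current application. Places with higher performance requirements may still not be satisfied.

The upload data is generated by a client written in Python. The Python test program has some effect on server performance, but it should not be large. The server receives the packet header, extracts the file path, opens the file, then receives the file content 8192 bytes at a time and writes it to the file. I plan to try receiving in multiple processes and writing in a single one.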
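
The server flow described in this reply could look roughly like the sketch below. The poster did not show the code or the wire format, so everything specific is an assumption: the header layout (16-bit path length, 32-bit file size, then the path), the two-byte acknowledgement, and the module name upload_sketch. The path sent by the client is used without any validation, which a real server should not do.

```erlang
-module(upload_sketch).
-export([start/1, handle/1]).

-define(CHUNK, 8192).

%% Listen on Port (0 picks a free port) and accept in a separate process.
start(Port) ->
    {ok, Listen} = gen_tcp:listen(Port, [binary, {packet, raw},
                                         {active, false}, {reuseaddr, true}]),
    spawn(fun() -> accept_loop(Listen) end),
    {ok, Listen}.

%% One Erlang process per connection.
accept_loop(Listen) ->
    case gen_tcp:accept(Listen) of
        {ok, Socket} ->
            Pid = spawn(?MODULE, handle, [Socket]),
            %% Hand the socket over; the result is ignored in case the
            %% handler has already finished.
            _ = gen_tcp:controlling_process(Socket, Pid),
            accept_loop(Listen);
        {error, _Reason} ->
            ok
    end.

%% Assumed header: <<PathLen:16, Size:32>>, followed by the path itself.
handle(Socket) ->
    {ok, <<PathLen:16, Size:32>>} = gen_tcp:recv(Socket, 6),
    {ok, Path} = gen_tcp:recv(Socket, PathLen),
    {ok, Fd} = file:open(Path, [write, raw, binary]),
    ok = copy(Socket, Fd, Size),
    ok = file:close(Fd),
    ok = gen_tcp:send(Socket, <<"ok">>),
    gen_tcp:close(Socket).

%% Receive at most 8192 bytes at a time and write each piece to the file.
copy(_Socket, _Fd, 0) ->
    ok;
copy(Socket, Fd, Remaining) ->
    N = min(?CHUNK, Remaining),
    {ok, Data} = gen_tcp:recv(Socket, N),
    ok = file:write(Fd, Data),
    copy(Socket, Fd, Remaining - N).
```

Sending the size in the header lets the server ask for exact chunk lengths. With a fixed gen_tcp:recv(Socket, 8192) and no size, the final partial chunk could be lost when the client closes the connection.
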
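Posted: 2007-05-21
Mostly I just was not familiar with it. When started with the -smp flag, performance with 10 concurrent processes is already close to C++. At 15 processes the two are on par, and at 50 processes the C++ version is clearly much slower than Erlang, probably mainly because the C++ version uses one thread per connection. Since this upload server sees fairly high concurrency, Erlang may suit it better than C++.

As for development efficiency, the C++ version took me a full day, and that was with C++ I already know well plus many libraries accumulated earlier, such as protocol parsing and a thread pool. The Erlang version, which I learned as I went, also took a full day, with far less code: 100 lines covering 2 protocols.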
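Posted: 2007-05-21
:)
Network programming in Erlang is quite convenient, but writing raw sockets yourself every time is too primitive. These days I usually use tcp_server.erl, which Joe provides in his robust_server tutorial; OpenPoker uses it too.
It offers a callback style for writing network applications, which is convenient, but you still have to handle things like message buffering yourself, and writing that kind of code over and over gets tiresome. My idea is to write something like gen_server that abstracts the callback behaviour, so that developing a network application is like using MINA or ACE: you just implement an Encoder, a Decoder, and a MessageHandler.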
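
As a rough illustration of that idea, and not the poster's actual design, the buffering can live in one generic function that calls into a callback module. The names codec_sketch, feed/3, decode/1, and handle_message/1 are all hypothetical. The example callbacks, placed in the same module for brevity, treat each newline-terminated line as one message.

```erlang
-module(codec_sketch).
-export([feed/3]).
%% Example callbacks for newline-delimited messages.
-export([decode/1, handle_message/1]).

%% Append newly received bytes to the buffer, then pass every complete
%% message to Mod:handle_message/1.
%% Returns {ListOfHandlerResults, RemainingBuffer}.
feed(Mod, Buffer, Bin) ->
    drain(Mod, <<Buffer/binary, Bin/binary>>, []).

drain(Mod, Buffer, Acc) ->
    case Mod:decode(Buffer) of
        {ok, Msg, Rest} ->
            drain(Mod, Rest, [Mod:handle_message(Msg) | Acc]);
        more ->
            {lists:reverse(Acc), Buffer}
    end.

%% Decoder callback: split off one line, or ask for more data.
decode(Buffer) ->
    case binary:split(Buffer, <<"\n">>) of
        [Line, Rest] -> {ok, Line, Rest};
        [_Incomplete] -> more
    end.

%% Handler callback: here it simply tags the message.
handle_message(Line) ->
    {echo, Line}.
```

For example, codec_sketch:feed(codec_sketch, <<"a">>, <<"b\nc">>) returns {[{echo,<<"ab">>}], <<"c">>}: one complete message is handled and the trailing <<"c">> stays buffered for the next call.
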
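Posted: 2007-05-21
Writing it yourself is pretty convenient too, compared with other languages. It is even easier than in Ruby or Python.

Besides what you mentioned, it should also connect seamlessly to TCP/UDP/SCTP, to higher-level protocols such as HTTP, and even to file streams; other languages have all built something similar. I still have not figured out behaviours in Erlang... I have no real grasp of any of this yet, still learning...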
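Posted: 2007-05-22
Quote:

1. The number of connections keeps climbing

2. Connections stay open for a long time

Does Erlang have any advantage for very large numbers of short-lived TCP connections? Is there a ready-made echo server benchmark?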