
Java theory and practice: Are all stateful Web applications broken? (repost)


Reposted from: http://www.ibm.com/developerworks/library/j-jtp09238/index.html

While there are many Web frameworks in the Java™ ecosystem, they are all based, directly or indirectly, on the Servlets infrastructure. The Servlets API provides a host of useful features, including state management through the HttpSession and ServletContext mechanisms, which allow the application to maintain state that persists across multiple user requests. However, some subtle (and largely unwritten) rules govern the use of shared state in Web applications, and many applications unknowingly fall afoul of them. The result is that many stateful Web applications have subtle and serious flaws.

 

Scoped containers

The ServletContext, HttpSession, and HttpRequest objects in the Servlet specification are referred to as scoped containers. Each of these has getAttribute() and setAttribute() methods, which store data on behalf of the application. The difference between them is the lifetime of the scoped container. For HttpRequest, the data only persists for the lifetime of the request; for HttpSession, it persists for the lifetime of a session between a user and the application; and for ServletContext, it persists for the lifetime of the application.

Because the HTTP protocol is stateless, scoped containers are tremendously useful in the construction of stateful Web applications; the servlet container takes responsibility for managing application state and data life cycle. While the specification is largely silent on the subject, the session- and application-scoped containers must also to some degree be thread-safe, because the getAttribute() and setAttribute() methods may be called at any time by different threads. (The specification does not directly mandate that these implementations be thread-safe, but the nature of the service they provide effectively requires it.)

Scoped containers also offer another potentially significant benefit to Web applications: the container can manage replication and fail-over of application state transparently to the application.

 

Sessions

A session is a series of request-response exchanges between a specific user and a Web application. Users expect that Web sites will remember their authentication credentials, the contents of their shopping cart, and information entered in Web forms on previous requests, but the core HTTP protocol is stateless, meaning that all the information about a request must be stored in the request itself. So to create useful interactions with users that last longer than a single request-response cycle, session state must be maintained somewhere. The servlet framework allows each request to be associated with a session and provides the HttpSession interface to act as a value store for (key, value) data items relevant to that session. Listing 1 shows a typical bit of servlet code that stores shopping cart data in the HttpSession:


Listing 1. Using HttpSession to store shopping cart information

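// Look up the session for this request, creating one if necessary.
HttpSession session = request.getSession(true);
ShoppingCart cart = (ShoppingCart)session.getAttribute("shoppingCart");
if (cart == null) {
    // First request on this session: create the cart and store it.
    cart = new ShoppingCart(...);
    session.setAttribute("shoppingCart", cart);
}
doSomethingWith(cart);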


The usage in Listing 1 is typical for servlets; the application looks to see if an object has already been placed in the session, and if not, it creates one that can be used by subsequent requests on that session. Web frameworks built atop servlets (such as JSP, JSF, SpringMVC, and so on) hide the details but essentially perform this same sort of operation on your behalf for data that is tagged as session-scoped. Unfortunately, the usage in Listing 1 is also likely to be incorrect.

 

Threading considerations

When an HTTP request arrives at the servlet container, HttpRequest and HttpResponse objects are created and passed to the service() method of a servlet, in the context of a thread managed by the servlet container. The servlet is responsible for producing the response; the servlet maintains control of that thread until the response is complete, at which point the thread is returned to the pool of available worker threads. Servlet containers maintain no affinity between threads and sessions; the next request to come in on a given session will likely be serviced by a different thread than the current request. In fact, it is possible for multiple simultaneous requests to come in on the same session (which can happen in Web applications that use frames or AJAX techniques to fetch data from the server while the user is interacting with the page). In this case, there can be multiple simultaneous requests from the same user executing concurrently on different threads.

Most of the time, threading considerations like these are irrelevant to the Web application developer. The stateless nature of HTTP encourages that the response be a function only of data stored in the request (which is not shared with other concurrent requests) and data stored in repositories (such as databases) that already manage concurrency control. However, once a Web application stores data in a shared container like HttpSession or ServletContext, we've turned our Web application into a concurrent one, and we now have to think about thread-safety within the application.

 

While thread-safety is a term we typically use to describe code, in actuality it is about data. Specifically, thread safety is about properly coordinating access to mutable data that is accessed by multiple threads. Servlet applications are frequently thread-safe by virtue of the fact that they do not share any mutable data and therefore require no additional synchronization. But there are lots of ways that shared state can be introduced into Web applications: not only scoped containers like HttpSession and ServletContext, but also static fields and instance fields of HttpServlet objects. Once a Web application wants to share data across requests, the application developer must pay attention to where that shared data is and ensure that there is sufficient coordination (synchronization) between threads when accessing the shared data to avoid threading hazards.

 

Threading risks for Web applications

When a Web application stores mutable session data such as a shopping cart in an HttpSession, it becomes possible that two requests may try to access the shopping cart at the same time. Several failure modes are possible, including:

  • An atomicity failure, where one thread is updating multiple data items and another thread reads the data while they are in an inconsistent state

  • A visibility failure between a reading thread and a writing thread, where one thread modifies the cart but the other sees a stale or inconsistent state for the cart's contents

Atomicity failures

Listing 2 shows a (broken) implementation of methods for setting and retrieving the high scores in a gaming application. It uses a PlayerScore object to represent the high score, which is an ordinary JavaBean class with the properties name and score, stored in the application-scoped ServletContext. (It is assumed that, at application startup, the initial high score is installed as the highScore attribute in the ServletContext, so the getAttribute() calls will not fail.)


Listing 2. Broken scheme for storing related items in a scoped container

public PlayerScore getHighScore() {
    ServletContext ctx = getServletConfig().getServletContext();
    PlayerScore hs = (PlayerScore) ctx.getAttribute("highScore");
    PlayerScore result = new PlayerScore();
    result.setName(hs.getName());
    result.setScore(hs.getScore());
    return result;
}

public void updateHighScore(PlayerScore newScore) {
    ServletContext ctx = getServletConfig().getServletContext();
    PlayerScore hs = (PlayerScore) ctx.getAttribute("highScore");
    if (newScore.getScore() > hs.getScore()) {
        hs.setName(newScore.getName());
        hs.setScore(newScore.getScore());
    }
} 


A number of things about the code in Listing 2 are broken. The approach taken here is to store a mutable holder for the high scoring player's name and score in the ServletContext. When a new high score is reached, both the name and score must be updated.

 

Suppose the current high scoring player is Bob, with a score of 1000, and his score is beaten by Joe, with a score of 1100. Near the time at which Joe's score is being installed, another player requests the high score. The getHighScore() method will retrieve the PlayerScore object from the servlet context and fetch the name and score from it. With some unlucky timing, though, it is possible to retrieve Bob's name and Joe's score, showing Bob to have achieved a score of 1100, something that never happened. (This failure might be acceptable for a free game site, but replace "score" with "bank balance" and it seems less harmless.) This is an atomicity failure, in that two operations that are supposed to be atomic with respect to each other — fetching the name/score pair and updating the name/score pair — did not in fact execute atomically with respect to each other, and one of the threads was allowed to see the shared data in an inconsistent state.
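To make the Bob/Joe scenario concrete, here is a minimal sketch that replays that one unlucky interleaving step by step on a single thread, so the outcome is deterministic. MutablePlayerScore and TornReadReplay are hypothetical names introduced for this illustration; they stand in for the PlayerScore bean and the two methods of Listing 2 and are not part of the servlet API.

```java
// Hypothetical stand-in for the mutable PlayerScore JavaBean of Listing 2.
class MutablePlayerScore {
    private String name;
    private int score;

    MutablePlayerScore(String name, int score) {
        this.name = name;
        this.score = score;
    }

    String getName() { return name; }
    int getScore() { return score; }
    void setName(String name) { this.name = name; }
    void setScore(int score) { this.score = score; }
}

class TornReadReplay {
    // Replays one interleaving of getHighScore() and updateHighScore()
    // on a single thread; no real concurrency is involved.
    static String replay() {
        MutablePlayerScore shared = new MutablePlayerScore("Bob", 1000);

        // Reader, step 1: copies the name.
        String seenName = shared.getName();

        // Writer runs to completion in between: Joe beats Bob.
        shared.setName("Joe");
        shared.setScore(1100);

        // Reader, step 2: copies the score.
        int seenScore = shared.getScore();

        return seenName + " " + seenScore;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints "Bob 1100"
    }
}
```

The reader ends up with a name/score pair that no player ever achieved, which is exactly the inconsistent state described above.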

 

Further, because the score-updating logic follows the check-then-act pattern, it is possible for two threads to "race" to update the high score, with unpredictable results. Suppose the current high score is 1000, and two players simultaneously register high scores of 1100 and 1200. With some unlucky timing, both will pass the test of "is new score higher than existing high score," and both will enter the block that updates the high score. Again, depending on timing, the outcome might be inconsistent (the name of one player and the high score of the other), or just wrong (the player scoring 1100 could overwrite the name and score of the player scoring 1200).
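The check-then-act race can be replayed the same way. The sketch below is a single-threaded simulation under the assumption that both updaters perform their check before either performs its update; LostUpdateReplay is a hypothetical name, and a plain int stands in for the shared high score.

```java
class LostUpdateReplay {
    // Replays the interleaving "A checks, B checks, B acts, A acts"
    // on one thread so the result is deterministic.
    static int replay() {
        int sharedHighScore = 1000;
        int scoreA = 1100;
        int scoreB = 1200;

        // Both threads run the check against the same old value.
        boolean aPassesCheck = scoreA > sharedHighScore;
        boolean bPassesCheck = scoreB > sharedHighScore;

        // B writes first, then A overwrites it with the lower score.
        if (bPassesCheck) {
            sharedHighScore = scoreB;
        }
        if (aPassesCheck) {
            sharedHighScore = scoreA;
        }
        return sharedHighScore;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 1100
    }
}
```

The score of 1200 is lost even though each thread's own check was correct at the moment it ran.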

 

Visibility failures

More subtle than atomicity failures are visibility failures. In the absence of synchronization, if one thread writes to a variable and another thread reads that same variable, the reading thread could see stale, or out-of-date, data. Worse, it is possible for the reading thread to see up-to-date data for variable x and stale data for variable y, even if y was written before x. Visibility failures are subtle because they don't happen predictably, or even frequently, causing rare and difficult-to-debug intermittent failures. Visibility failures are created by data races — failure to properly synchronize when accessing shared variables. Programs with data races are, for all intents and purposes, broken, in that their behavior cannot be reliably predicted.

 

The Java Memory Model (JMM) defines the conditions under which a thread reading a variable is guaranteed to see the results of a write in another thread. (A full explanation of the JMM is beyond the scope of this article; see Resources.) The JMM defines an ordering on the operations of a program called happens-before. Happens-before orderings across threads are only created by synchronizing on a common lock or accessing a common volatile variable. In the absence of a happens-before ordering, the Java platform has great latitude to delay or change the order in which writes in one thread become visible to reads of that same variable in another.
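As a small illustration of a happens-before edge created through a volatile variable, consider the following sketch. VolatilePublication and its members are hypothetical names; the point is only that the ordinary write to payload becomes visible to the reader because it precedes a volatile write that the reader later observes.

```java
class VolatilePublication {
    private int payload;            // plain field, no synchronization of its own
    private volatile boolean ready; // volatile flag that carries the ordering

    void publish(int value) {
        payload = value; // (1) ordinary write
        ready = true;    // (2) volatile write; (1) happens-before (2)
    }

    int awaitAndRead() {
        while (!ready) {   // (3) volatile read; loops until it observes (2)
            Thread.yield();
        }
        return payload;    // (4) guaranteed to see the value written in (1)
    }

    static int demo() {
        VolatilePublication box = new VolatilePublication();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = box.awaitAndRead());
        reader.start();
        box.publish(42);
        try {
            reader.join(); // join() makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

If ready were not volatile, nothing would order the two threads' accesses, and the reader could loop forever or return a stale payload.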

 

The code in Listing 2 has visibility failures as well as atomicity failures. The updateHighScore() method retrieves the PlayerScore object from the ServletContext and then modifies the state of the PlayerScore object. The intent is for those modifications to be visible to other threads that call getHighScore(), but in the absence of a happens-before ordering between the writes to the name and score properties in updateHighScore() and the reads of those properties in other threads calling getHighScore(), we are relying on good luck for the reading threads to see the correct values.

 

Possible solutions

While the servlet specification does not adequately describe the happens-before guarantees that a servlet container must provide, one is forced to conclude that placing an attribute in a shared scoped container (HttpSession or ServletContext) happens before another thread retrieves that same attribute. (See JCiP 4.5.1 for the reasoning behind this conclusion. All the specification says is "Multiple servlets executing request threads may have active access to a single session object at the same time. The Developer has the responsibility for synchronizing access to session resources as appropriate.")

 

The set-after-write trick

It is a commonly cited "best practice" that when updating mutable data stored in scoped session containers, one must call setAttribute() again after modifying the data. Listing 3 shows an example of updateHighScore() rewritten to use this technique. (One of the motivations for this technique is to hint to the container that the value has been changed, so that the session or application state can be resynchronized across instances in a distributed Web application.)


Listing 3. Using the set-after-write technique to hint to the servlet container that the value has been updated

public void updateHighScore(PlayerScore newScore) {
    ServletContext ctx = getServletConfig().getServletContext();
    PlayerScore hs = (PlayerScore) ctx.getAttribute("highScore");
    if (newScore.getScore() > hs.getScore()) {
        hs.setName(newScore.getName());
        hs.setScore(newScore.getScore());
        ctx.setAttribute("highScore", hs);
    }
} 


Unfortunately, while this technique helps with the problem of efficiently replicating session and application state in clustered applications, it is not enough to fix the basic thread-safety problems in our example. It is enough to mitigate the visibility problems (that another player might never see the values updated in updateHighScore()), but it is not enough to address the multiple potential atomicity problems.

 

Piggybacking on synchronization

The set-after-write technique is able to eliminate the visibility problems because the happens-before ordering is transitive, and there is a happens-before edge between the call to setAttribute() in updateHighScore() and the call to getAttribute() in getHighScore(). Because the updates to the PlayerScore state happen before setAttribute(), which happens before the return from getAttribute(), which happens before the use of the state by the caller of getHighScore(), transitivity lets us conclude that the values seen by callers of getHighScore() are at least as up to date as the most recent call to setAttribute(). This technique is called piggybacking on synchronization, because the getHighScore() and updateHighScore() methods are able to use their knowledge of synchronization in getAttribute() and setAttribute() to provide some minimal guarantees of visibility. However, in the example as written, it is still not enough. The set-after-write technique may be useful for state replication, but it is not enough to provide thread safety.
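The piggybacking argument can be sketched outside a servlet container. In the example below a ConcurrentHashMap plays the role of the scoped container; this is an assumption made purely for illustration (the servlet specification does not say how a container stores attributes), chosen because the java.util.concurrent documentation states that actions before placing an object into a concurrent collection happen-before actions after another thread retrieves it. PiggybackSketch and Score are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PiggybackSketch {
    // Mutable holder with plain fields: no volatile, no locking of its own.
    static class Score {
        String name;
        int score;
    }

    static String demo() {
        // Stand-in for a scoped container's attribute store (assumption).
        ConcurrentMap<String, Object> attributes = new ConcurrentHashMap<>();
        String[] seen = new String[1];

        Thread reader = new Thread(() -> {
            Object attr;
            // Poll until the attribute has been published.
            while ((attr = attributes.get("highScore")) == null) {
                Thread.yield();
            }
            Score hs = (Score) attr;
            // These plain reads piggyback on the map's synchronization.
            seen[0] = hs.name + " " + hs.score;
        });
        reader.start();

        Score hs = new Score();
        hs.name = "Joe";                 // plain writes ...
        hs.score = 1100;
        attributes.put("highScore", hs); // ... made visible by the put()

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "Joe 1100"
    }
}
```

The sketch only demonstrates visibility. If the writer went on to modify the holder after the put(), the reader could again observe a half-updated pair, which is why this technique alone does not provide thread safety.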

 

Leaning on immutability

A useful technique for creating thread-safe applications is to lean on immutable data as much as possible. Listing 4 shows our high score example rewritten to use an immutable implementation of HighScore that is free of the atomicity failures that would allow a caller to see a nonexistent player/score pair, as well as the visibility failures that would prevent a caller of getHighScore() from seeing the most recent values written by a call to updateHighScore():


Listing 4. Using an immutable HighScore object to close most of the atomicity and visibility holes

public class HighScore {
    // Both fields are final and assigned once, so instances are immutable.
    public final String name;
    public final int score;

    public HighScore(String name, int score) {
        this.name = name;
        this.score = score;
    }
}

public HighScore getHighScore() {
    ServletContext ctx = getServletConfig().getServletContext();
    return (HighScore) ctx.getAttribute("highScore");
}

public void updateHighScore(HighScore newScore) {
    ServletContext ctx = getServletConfig().getServletContext();
    HighScore hs = (HighScore) ctx.getAttribute("highScore");
    // Still a check-then-act: two concurrent updates can race here.
    if (newScore.score > hs.score)
        ctx.setAttribute("highScore", newScore);
}


The code in Listing 4 has many fewer potential failure modes. Piggybacking on the synchronization in setAttribute() and getAttribute() guarantees visibility. The fact that only a single immutable data item is being stored eliminates the potential atomicity failure that a caller to getHighScore() could see an inconsistent update to the name/score pair.

 

Placing immutable objects in a scoped container avoids most atomicity and visibility failures; it is also safe to place effectively immutable objects in a scoped container. Effectively immutable objects are those that, while theoretically mutable, are never actually modified after being published, such as a JavaBean whose setters are never called after placing the object in an HttpSession.
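The same idea can be applied to session data such as the shopping cart from Listing 1. The following is a minimal sketch, not a prescribed design: ImmutableCart and withItem() are hypothetical names, and the cart holds plain strings only to keep the example short. Instead of mutating a cart that other threads may be reading, each change produces a new cart object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ImmutableCart {
    private final List<String> items;

    ImmutableCart() {
        this.items = Collections.emptyList();
    }

    private ImmutableCart(List<String> items) {
        // Wrap the private copy so callers cannot modify it.
        this.items = Collections.unmodifiableList(items);
    }

    // Returns a new cart; the receiver is left untouched.
    ImmutableCart withItem(String item) {
        List<String> copy = new ArrayList<>(items);
        copy.add(item);
        return new ImmutableCart(copy);
    }

    List<String> items() {
        return items;
    }
}
```

A thread that obtained the old cart keeps seeing a consistent snapshot. Storing the new cart back into the session is still a read-modify-write step, so concurrent additions can overwrite each other unless that step is made atomic.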

 

Data placed in an HttpSession is not only accessed by the requests on that session; it may also be accessed by the container itself if the container is doing any sort of state replication.

All data placed in an HttpSession or ServletContext should be thread-safe or effectively immutable.

Effecting atomic state transitions

The code in Listing 4 still has one problem, though — the check-then-act in updateHighScore() still enables a potential race between two threads trying to update the high score. With some unlucky timing, an update could be lost. Two threads could pass the "is the new high score greater than the old one" check at the same time, causing both to call setAttribute(). Depending on timing, there is no guarantee that the higher of these two scores will win. To close this last hole, we need a means of atomically updating the score reference while guaranteeing freedom from interference. Several approaches can be used to do so.

 

Listing 5 adds synchronization to updateHighScore() to ensure that the check-then-act inherent in the update process cannot execute concurrently with another update. This approach is adequate provided that all such conditional modification logic acquires the same lock used by updateHighScore().


Listing 5. Using synchronization to close the last atomicity hole

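// Assumed declaration: the listing uses `lock` without showing where it is defined.
private final Object lock = new Object();

public void updateHighScore(HighScore newScore) {
    ServletContext ctx = getServletConfig().getServletContext();
    synchronized (lock) {
        // Read the current value while holding the lock, so that the check
        // and the update are atomic with respect to other updaters.
        HighScore hs = (HighScore) ctx.getAttribute("highScore");
        if (newScore.score > hs.score)
            ctx.setAttribute("highScore", newScore);
    }
}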


While the technique in Listing 5 works, there is an even better technique: use the AtomicReference class in the java.util.concurrent.atomic package. This class is designed to provide atomic conditional updates through the compareAndSet() call. Listing 6 shows how to use an AtomicReference to restore this last bit of atomicity to our example. This approach is preferable to the code in Listing 5 because it is harder to accidentally violate the assumptions about how to update the high score.


Listing 6. Using an AtomicReference to close the last atomicity hole

import java.util.concurrent.atomic.AtomicReference;

public HighScore getHighScore() {
    ServletContext ctx = getServletConfig().getServletContext();
    AtomicReference<HighScore> holder
        = (AtomicReference<HighScore>) ctx.getAttribute("highScore");
    return holder.get();
}

public void updateHighScore(HighScore newScore) {
    ServletContext ctx = getServletConfig().getServletContext();
    AtomicReference<HighScore> holder
        = (AtomicReference<HighScore>) ctx.getAttribute("highScore");
    while (true) {
        HighScore old = holder.get();
        // Nothing to do if the current high score is at least as good.
        if (old.score >= newScore.score)
            break;
        // Install the new score only if nobody changed it since we read it;
        // otherwise loop and re-check against the latest value.
        else if (holder.compareAndSet(old, newScore))
            break;
    }
}

For mutable objects placed in scoped containers, their state transitions should be made atomic, either through synchronization or through the atomic variable classes in java.util.concurrent.atomic.
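To see the compare-and-set loop hold up under real concurrency, the following self-contained sketch runs it from several threads without a servlet container. AtomicHighScoreDemo and its nested Score class are hypothetical names, and the player names and scores are made up for the demonstration.

```java
import java.util.concurrent.atomic.AtomicReference;

class AtomicHighScoreDemo {
    // Immutable name/score pair, in the spirit of Listing 4.
    static final class Score {
        final String name;
        final int score;

        Score(String name, int score) {
            this.name = name;
            this.score = score;
        }
    }

    // The same compare-and-set loop, minus the ServletContext lookup.
    static void update(AtomicReference<Score> holder, Score candidate) {
        while (true) {
            Score old = holder.get();
            if (old.score >= candidate.score) {
                return; // current high score stands
            }
            if (holder.compareAndSet(old, candidate)) {
                return; // installed without interference
            }
            // Another thread won the race; retry against the new value.
        }
    }

    static int demo() {
        AtomicReference<Score> holder =
            new AtomicReference<>(new Score("Bob", 1000));
        Thread[] players = new Thread[8];
        for (int i = 0; i < players.length; i++) {
            Score candidate = new Score("player" + i, 1000 + 100 * (i + 1));
            players[i] = new Thread(() -> update(holder, candidate));
            players[i].start();
        }
        for (Thread t : players) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return holder.get().score;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1800, the highest submitted score
    }
}
```

Whatever order the threads run in, the highest submitted score survives. On Java 8 and later, the loop can also be written as holder.accumulateAndGet(candidate, (current, c) -> c.score > current.score ? c : current).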

Serializing access to an HttpSession

In the examples I've given so far, I've tried to avoid the various hazards associated with accessing data in the application-wide ServletContext. It is clear that careful coordination is required when accessing the ServletContext, because the ServletContext is accessible from any request. Most stateful Web applications, however, lean more heavily on the session-scoped container, HttpSession. It may not be obvious how multiple simultaneous requests could happen on the same session; after all, a session is tied to a particular user and browser session, and users might not seem to request multiple pages at once. But requests on a session can overlap in applications that generate requests programmatically, such as AJAX applications.

 

Requests on a single session can indeed overlap, and this ability is unfortunate. If requests on a session could be easily serialized, nearly all the hazards described here would not be an issue when accessing shared objects in an HttpSession; serialization would prevent the atomicity failures, and piggybacking on the synchronization implicit in HttpSession would prevent the visibility failures. And serializing requests tied to a specific session is unlikely to impose any significant impact on throughput, as it is somewhat rare to have requests on a session overlap at all, and it is quite rare to have many requests on a session overlap.

 

Unfortunately, there's no option in the servlet specification to say "force requests on the same session to be serialized." However, the SpringMVC framework offers a way to ask for this, and the approach can be reimplemented in other frameworks easily. The base class for SpringMVC controllers, AbstractController, provides a boolean property synchronizeOnSession; when this is set, it will use a lock to ensure that only one request on a session executes at a time.

Serializing requests on an HttpSession makes many concurrency hazards go away, in a similar way that confining objects to the Event Dispatch Thread (EDT) reduces the requirement for synchronization in Swing applications.
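The general idea of serializing requests per session can be sketched with a lock per session ID. This is an illustration under stated assumptions, not SpringMVC's implementation: SessionSerializer and runSerialized() are hypothetical names, sessions are identified by a plain string, and locks are never removed from the map, which a real implementation would have to handle (for example by keeping the lock object in the session itself).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class SessionSerializer {
    // One lock object per session ID (never evicted in this sketch).
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

    // Runs the handler while holding the lock for the given session, so
    // handlers for the same session never overlap.
    void runSerialized(String sessionId, Runnable handler) {
        Object lock = locks.computeIfAbsent(sessionId, id -> new Object());
        synchronized (lock) {
            handler.run();
        }
    }

    static int demo() {
        SessionSerializer serializer = new SessionSerializer();
        int[] hits = new int[1]; // plain counter, safe only because access is serialized
        Thread[] requests = new Thread[4];
        for (int i = 0; i < requests.length; i++) {
            requests[i] = new Thread(() -> {
                for (int n = 0; n < 10_000; n++) {
                    serializer.runSerialized("session-1", () -> hits[0]++);
                }
            });
            requests[i].start();
        }
        for (Thread t : requests) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return hits[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 40000
    }
}
```

Because every handler for "session-1" runs under the same lock, the unsynchronized increment behaves as if it were single-threaded. Requests on different sessions use different locks and still run in parallel.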

Summary

Many stateful Web applications have significant concurrency vulnerabilities that stem from accessing mutable data stored in scoped containers like HttpSession and ServletContext without adequate coordination. It is easy to mistakenly assume that the synchronization inherent in the getAttribute() and setAttribute() methods is sufficient, but that only holds true under certain circumstances, such as when the attribute is an immutable, an effectively immutable, or a thread-safe object, or when requests that might access the container are serialized.

In general, everything you place in a scoped container should be effectively immutable or thread-safe. The scoped container mechanism provided by the servlet specification was never intended to manage mutable objects that did not provide their own synchronization. The biggest offender is storing ordinary JavaBeans classes in an HttpSession. This technique is only guaranteed to work when the JavaBean is never modified after it is stored in the session.

 
