Monthly Archives: May 2011

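wsimport generates a baffling @XmlElementRefs

If the classes that wsimport generates for you do not contain ordinary member variables such as firstName, lastName and middleName, but instead contain something like this: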

    @XmlElementRefs({
        @XmlElementRef(name = "middle-name", type = JAXBElement.class),
        @XmlElementRef(name = "last-name", type = JAXBElement.class),       
        @XmlElementRef(name = "first-name", type = JAXBElement.class),
    })
    protected List<JAXBElement<? extends Serializable>> content;

then the cause is that your WSDL uses a duplicate element name.

"It appears to be due to the same element name of PublicationReferenceType in different namespaces"

http://forums.epo.org/open-patent-services-and-publication-server-web-service/topic815.html
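
One way to look for the culprit is to scan the schema for element names that occur more than once inside a single complex type. The following is a rough diagnostic sketch, not part of wsimport: the class name `DuplicateElementNameFinder` and its methods are hypothetical, it assumes the schema text is already available as a string, and it counts elements of nested anonymous types as part of the enclosing type.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: reports element local names used more than once per complexType.
class DuplicateElementNameFinder {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    /** Returns entries of the form "TypeName: localName" for every repeated element name. */
    static List<String> findDuplicates(String schemaXml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(schemaXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Schema could not be parsed", e);
        }
        List<String> duplicates = new ArrayList<String>();
        NodeList types = doc.getElementsByTagNameNS(XSD_NS, "complexType");
        for (int i = 0; i < types.getLength(); i++) {
            Element type = (Element) types.item(i);
            Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
            NodeList elements = type.getElementsByTagNameNS(XSD_NS, "element");
            for (int j = 0; j < elements.getLength(); j++) {
                Element el = (Element) elements.item(j);
                String name = el.hasAttribute("name") ? el.getAttribute("name") : el.getAttribute("ref");
                // Drop a namespace prefix such as "a:" so that a:first-name and b:first-name collide.
                String local = name.substring(name.indexOf(':') + 1);
                if (local.isEmpty()) {
                    continue;
                }
                Integer seen = counts.get(local);
                counts.put(local, seen == null ? 1 : seen + 1);
            }
            for (Map.Entry<String, Integer> entry : counts.entrySet()) {
                if (entry.getValue() > 1) {
                    duplicates.add(type.getAttribute("name") + ": " + entry.getKey());
                }
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Made-up schema: the same local name is referenced from two namespaces.
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' xmlns:a='urn:a' xmlns:b='urn:b'>"
                + "<xs:complexType name='Person'><xs:sequence>"
                + "<xs:element ref='a:first-name'/>"
                + "<xs:element ref='b:first-name'/>"
                + "<xs:element name='last-name' type='xs:string'/>"
                + "</xs:sequence></xs:complexType></xs:schema>";
        System.out.println(findDuplicates(xsd)); // [Person: first-name]
    }
}
```

Any type it reports is a candidate for the catch-all `content` list; renaming the element or adding a JAXB binding customization are the usual next steps.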

Facebook’s architecture (reposted)

From various readings and conversations I had, my understanding of Facebook’s current architecture is:

* Web front-end written in PHP. Facebook’s HipHop [1] then converts it to C++ and compiles it using g++, thus providing a high performance templating and Web logic execution layer

* Business logic is exposed as services using Thrift [2]. Some of these services are implemented in PHP, C++ or Java depending on service requirements (some other languages are probably used…)

* Services implemented in Java don’t use any of the usual enterprise application servers but rather Facebook’s custom application server. At first this can look like reinventing the wheel, but as these services are exposed and consumed only (or mostly) through Thrift, the overhead of Tomcat, or even Jetty, was probably too high with no significant added value for their needs.

* Persistence is done using MySQL, Memcached [3], Facebook’s Cassandra [4], Hadoop’s HBase [5]. Memcached is used as a cache for MySQL as well as a general purpose cache. Facebook engineers admit that their use of Cassandra is currently decreasing as they now prefer HBase for its simpler consistency model and its MapReduce ability.

* Offline processing is done using Hadoop and Hive

* Data such as logging, clicks and feeds transit through Scribe [6] and are aggregated and stored in HDFS using Scribe-HDFS [7], thus allowing extended analysis using MapReduce

* BigPipe [8] is their custom technology to accelerate page rendering using a pipelining logic

* Varnish Cache [9] is used for HTTP proxying. They preferred it for its high performance and efficiency [10].

* The storage of the billions of photos posted by the users is handled by Haystack, an ad-hoc storage solution developed by Facebook which brings low level optimizations and append-only writes [11].

* Facebook Messages uses its own architecture, which is notably based on infrastructure sharding and dynamic cluster management. Business logic and persistence are encapsulated in so-called ‘Cells’. Each Cell handles a subset of users; new Cells can be added as popularity grows [12]. Persistence is achieved using HBase [13].

* Facebook Messages’ search engine is built with an inverted index stored in HBase [14]

* Facebook Search Engine’s implementation details are unknown as far as I know

* The typeahead search uses a custom storage and retrieval logic [15]

* Chat is based on an Epoll server developed in Erlang and accessed using Thrift [16]

About the resources provisioned for each of these components, some information and numbers are known:

* Facebook is estimated to own more than 60,000 servers [17]. Their recent datacenter in Prineville, Oregon is based on entirely self-designed hardware [18] that was recently unveiled as Open Compute Project [19].

* 300 TB of data is stored in Memcached processes [20]

* Their Hadoop and Hive cluster is made of 3000 servers, each with 8 cores, 32 GB RAM and 12 TB of disk, for a total of 24k cores, 96 TB RAM and 36 PB of disk [20]

* 100 billion hits per day, 50 billion photos, 3 trillion objects cached, 130 TB of logs per day as of July 2010 [21]
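
The Hadoop and Hive cluster totals quoted above follow directly from the per-server figures. A minimal sketch of the arithmetic, assuming decimal units (1 TB = 1000 GB, 1 PB = 1000 TB); the class name `ClusterTotals` is only illustrative:

```java
// Illustrative arithmetic only; the figures come from the list above.
class ClusterTotals {
    /** Multiplies a per-server quantity by the number of servers. */
    static long total(long servers, long perServer) {
        return servers * perServer;
    }

    public static void main(String[] args) {
        long servers = 3000;
        long cores = total(servers, 8);    // cores across the cluster
        long ramGb = total(servers, 32);   // RAM in GB
        long diskTb = total(servers, 12);  // disk in TB

        System.out.println(cores + " cores");            // 24000 cores
        System.out.println(ramGb / 1000 + " TB RAM");    // 96 TB RAM
        System.out.println(diskTb / 1000 + " PB disk");  // 36 PB disk
    }
}
```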

[1] HipHop for PHP: http://developers.facebook.com/blog/post/358

[2] Thrift: http://thrift.apache.org/

[3] Memcached: http://memcached.org/

[4] Cassandra: http://cassandra.apache.org/

[5] HBase: http://hbase.apache.org/

[6] Scribe: https://github.com/facebook/scribe

[7] Scribe-HDFS: http://hadoopblog.blogspot.com/2009/06/hdfs-scribe-integration.html

[8] BigPipe: http://www.facebook.com/notes/facebook-engineering/bigpipe-pipelining-web-pages-for-high-performance/389414033919

[9] Varnish Cache: http://www.varnish-cache.org/

[10] Facebook goes for Varnish: http://www.varnish-software.com/customers/facebook

[11] Needle in a haystack: efficient storage of billions of photos: http://www.facebook.com/note.php?note_id=76191543919

[12] Scaling the Messages Application Back End: http://www.facebook.com/note.php?note_id=10150148835363920

[13] The Underlying Technology of Messages: https://www.facebook.com/note.php?note_id=454991608919

[14] The Underlying Technology of Messages Tech Talk: http://www.facebook.com/video/video.php?v=690851516105

[15] Facebook’s typeahead search architecture: http://www.facebook.com/video/video.php?v=432864835468

[16] Facebook Chat: http://www.facebook.com/note.php?note_id=14218138919

[17] Who has the most Web Servers?: http://www.datacenterknowledge.com/archives/2009/05/14/whos-got-the-most-web-servers/

[18] Building Efficient Data Centers with the Open Compute Project: http://www.facebook.com/note.php?note_id=10150144039563920

[19] Open Compute Project: http://opencompute.org/

[20] Facebook’s architecture presentation at Devoxx 2010: http://www.devoxx.com

[21] Scaling Facebook to 500 million users and beyond: http://www.facebook.com/note.php?note_id=409881258919

Checking the version of JAX-WS bundled with the JDK

Just run “wsimport -version”.

The output looks something like this: “JAX-WS RI 2.1.6 in JDK 6”

You may wonder what RI stands for. See below:

The Reference Implementation of JAX-WS is developed as an open source project and is part of project GlassFish, an open source Java EE application server. It is called JAX-WS RI.

The formal definition of Java API for XML Web Services (JAX-WS)

The official site gives no definition, so Wikipedia has to help:

http://en.wikipedia.org/wiki/Java_API_for_XML_Web_Services

From Wikipedia, the free encyclopedia

The Java API for XML Web Services (JAX-WS) is a Java programming language API for creating web services. It is part of the Java EE platform from Sun Microsystems. Like the other Java EE APIs, JAX-WS uses annotations, introduced in Java SE 5, to simplify the development and deployment of web service clients and endpoints. It is part of the Java Web Services Development Pack.

The Reference Implementation of JAX-WS is developed as an open source project and is part of project GlassFish, an open source Java EE application server. It is called JAX-WS RI (for Reference Implementation) and is said to be a production-quality implementation (contrary to the former Reference Implementation, which was a proof of concept). This Reference Implementation is now part of the Metro distribution.

JAX-WS also is one of the foundations of WSIT.

Name change

JAX-WS 2.0 replaced the JAX-RPC API in Java Platform, Enterprise Edition 5. The name change reflected the move away from RPC-style and toward document-style web services.

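If two SOAP web services share an object, watch out for namespace problems

Suppose you have two SOAP-based web services, one called FooService and the other BarService,

and both services use HelloBean as a web method parameter or return value.

You also want your clients to use a single set of stubs, which means the client side has only one HelloBean.class.

In that case you must make the two web services use the same namespace. Otherwise you will hit an exception in a situation like this: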

    
    HelloBean bean = fooService.getHelloBean();
    // JAXB reports a namespace exception here, because the bean produced by
    // fooService and the bean saveHelloBean() expects are in different namespaces.
    barService.saveHelloBean(bean);
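
The underlying reason is that an XML name is a pair of namespace URI and local part, so two types called helloBean in different namespaces are different types. A minimal sketch using `javax.xml.namespace.QName` from the JDK; the namespace URIs and the class name `NamespaceMismatchDemo` are made up for illustration:

```java
import javax.xml.namespace.QName;

// Illustration only: the URIs below are hypothetical, real ones come from each service's WSDL.
class NamespaceMismatchDemo {
    /** True when two XML names agree on both namespace URI and local part. */
    static boolean sameXmlName(QName a, QName b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        QName fromFoo = new QName("http://foo.example.com/", "helloBean");
        QName fromBar = new QName("http://bar.example.com/", "helloBean");
        QName sharedA = new QName("http://shared.example.com/", "helloBean");
        QName sharedB = new QName("http://shared.example.com/", "helloBean");

        System.out.println(sameXmlName(fromFoo, fromBar)); // false: same local name, different namespace
        System.out.println(sameXmlName(sharedA, sharedB)); // true: both services share one namespace
    }
}
```

With JAX-WS annotations the namespace is typically fixed through `targetNamespace` on `@WebService` or `namespace` on `@XmlType`; which of the two applies depends on how the services were written, so treat that as a pointer rather than a recipe.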

   

Should you put Apache in front of Tomcat?

Reposted from

http://wiki.apache.org/tomcat/FAQ/Connectors#Q3

Why should I integrate Apache with Tomcat? (or not)

There are many reasons to integrate Tomcat with Apache. And there are reasons why it should not be done too. Needless to say, everyone will disagree with the opinions here. With the performance of Tomcat 5 and 6, performance reasons become harder to justify. So here are the issues to discuss in integrating vs not.

    * Clustering. By using Apache as a front end you can let Apache act as a front door to your content to multiple Tomcat instances. If one of your Tomcats fails, Apache ignores it and your Sysadmin can sleep through the night. This point could be ignored if you use a hardware loadbalancer and Tomcat’s clustering capabilities.

    * Clustering/Security. You can also use Apache as a front door to different Tomcats for different URL namespaces (/app1/, /app2/, /app3/, or virtual hosts). The Tomcats can then be each in a protected area and from a security point of view, you only need to worry about the Apache server. Essentially, Apache becomes a smart proxy server.

    * Security. This topic can sway one either way. Java has the security manager while Apache has a larger mindshare and more tricks with respect to security. I won’t go into this in more detail, but let Google be your friend. Depending on your scenario, one might be better than the other. But also keep in mind, if you run Apache with Tomcat – you have two systems to defend, not one.

    * Add-ons. Adding on CGI, Perl, PHP is very natural to Apache. It’s slower and more of a kludge for Tomcat. Apache also has hundreds of modules that can be plugged in at will. Tomcat can have this ability, but the code hasn’t been written yet.

    * Decorators! With Apache in front of Tomcat, you can perform any number of decorators that Tomcat doesn’t support or doesn’t have the immediate code support. For example, mod_headers, mod_rewrite, and mod_alias could be written for Tomcat, but why reinvent the wheel when Apache has done it so well?

    * Speed. Apache is faster at serving static content than Tomcat. But unless you have a high traffic site, this point is useless. But in some scenarios, Tomcat can be faster than Apache httpd. So benchmark YOUR site. Tomcat can perform at httpd speeds when using the proper connector (APR with sendFile enabled). Speed should not be considered a factor when choosing between Apache httpd and Tomcat.

    * Socket handling/system stability. Apache has better socket handling with respect to error conditions than Tomcat. The main reason is that Tomcat must perform all its socket handling via the JVM, which needs to be cross platform. The problem is that socket optimization is a platform-specific ordeal. Most of the time the Java code is fine, but when you are also bombarded with dropped connections, invalid packets and invalid requests from invalid IPs, Apache does a better job at dropping these error conditions than a JVM-based program. (YMMV)

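Facebook’s API

I took a quick look at Facebook’s API.

Basically, you can:

   1. Call the API from the client with JS, or from the server with PHP or Java.

   2. Get JSON-structured data back from the API, or get HTML directly. Both formats can serve a mash-up.

   3. Mash up a Facebook page by including it in an iframe, or call the API by embedding Facebook’s custom tags.

The official examples:

   1. Using an iframe directly on the client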

    <body>
      <iframe src="http://www.facebook.com/plugins/like.php?href=YOUR_URL"></iframe>
    </body>
    

   2. Using Facebook’s XML tag library (XFBML) on the client, which keeps your page code more concise

    <body>
      <script src="http://connect.facebook.net/en_US/all.js#xfbml=1"></script>
      <fb:like></fb:like>
    </body>

   3. Fetching JSON data on the client

    <body>
      <div id="fb-root"></div>
      <script src="http://connect.facebook.net/en_US/all.js">
      </script>
      <script>
         FB.init({ 
            appId:'119449798131809', cookie:true, 
            status:true, xfbml:true 
         });
         FB.api('/me', function(user) {
           if(user != null) {
              var image = document.getElementById('image');
              image.src = 'http://graph.facebook.com/' + user.id + '/picture';
              var name = document.getElementById('name');
              name.innerHTML = user.name;
           }
         });
       </script>
           <div align="center">
           <img id="image"/>
           <div id="name"></div>
           </div>
    </body>

   4. Fetching JSON data on the server (PHP)

<?php

define('YOUR_APP_ID', 'your app id ');
define('YOUR_APP_SECRET', 'your app secret');

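// Parses the fbs_<app id> cookie set by the JavaScript SDK and returns its
// fields, or null when the cookie is missing or its signature is invalid.
function get_facebook_cookie($app_id, $app_secret) {
  if (!isset($_COOKIE['fbs_' . $app_id])) {
    return null;
  }
  $args = array();
  parse_str(trim($_COOKIE['fbs_' . $app_id], '\\"'), $args);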
  ksort($args);
  $payload = '';
  foreach ($args as $key => $value) {
    if ($key != 'sig') {
      $payload .= $key . '=' . $value;
    }
  }
  if (md5($payload . $app_secret) != $args['sig']) {
    return null;
  }
  return $args;
}

$cookie = get_facebook_cookie(YOUR_APP_ID, YOUR_APP_SECRET);

// Only call the Graph API when a valid, signed cookie is present.
$user = $cookie ? json_decode(file_get_contents(
    'https://graph.facebook.com/me?access_token=' .
    $cookie['access_token'])) : null;

?>
<html>
  <body>
    <?php if ($cookie) { ?>
      Welcome <?= $user->name ?>
    <?php } else { ?>
      <fb:login-button></fb:login-button>
    <?php } ?>
    <div id="fb-root"></div>
    <script src="http://connect.facebook.net/en_US/all.js"></script>
    <script>
      FB.init({appId: '<?= YOUR_APP_ID ?>', status: true,
               cookie: true, xfbml: true});
      FB.Event.subscribe('auth.login', function(response) {
        window.location.reload();
      });
    </script>
  </body>
</html>
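
Since the API can also be called from a Java server, the signature check performed by get_facebook_cookie() can be sketched in Java as well. This is an illustration under assumptions, not an official Facebook sample: the class `CookieSignature` and its methods are hypothetical, and the cookie is assumed to have been parsed into a map of fields already. As in the PHP version, the fields other than sig are sorted by key, concatenated as key=value, the app secret is appended, and the MD5 digest is compared with sig.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper mirroring the PHP signature check above.
class CookieSignature {
    /** Returns the MD5 digest of the text as 32 lowercase hex characters. */
    static String md5Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Rebuilds the payload from the sorted non-"sig" fields and compares its digest with "sig". */
    static boolean isValid(Map<String, String> fields, String appSecret) {
        StringBuilder payload = new StringBuilder();
        // TreeMap iterates keys in sorted order, standing in for PHP's ksort().
        for (Map.Entry<String, String> entry : new TreeMap<String, String>(fields).entrySet()) {
            if (!entry.getKey().equals("sig")) {
                payload.append(entry.getKey()).append('=').append(entry.getValue());
            }
        }
        return md5Hex(payload.toString() + appSecret).equals(fields.get("sig"));
    }

    public static void main(String[] args) {
        // Made-up field values; a real cookie carries fields such as access_token.
        Map<String, String> fields = new HashMap<String, String>();
        fields.put("uid", "42");
        fields.put("access_token", "token");
        fields.put("sig", md5Hex("access_token=tokenuid=42" + "secret"));

        System.out.println(isValid(fields, "secret")); // true
        System.out.println(isValid(fields, "other"));  // false
    }
}
```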

The Internet mail message format is defined in RFC 5322

http://tools.ietf.org/html/rfc5322

3.4. Address Specification

   Addresses occur in several message header fields to indicate senders
   and recipients of messages.  An address may either be an individual
   mailbox, or a group of mailboxes.

   address         =   mailbox / group

   mailbox         =   name-addr / addr-spec

   name-addr       =   [display-name] angle-addr

   angle-addr      =   [CFWS] "<" addr-spec ">" [CFWS] /
                       obs-angle-addr

   group           =   display-name ":" [group-list] ";" [CFWS]

   display-name    =   phrase

   mailbox-list    =   (mailbox *("," mailbox)) / obs-mbox-list

   address-list    =   (address *("," address)) / obs-addr-list

   group-list      =   mailbox-list / CFWS / obs-group-list
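
To make the grammar more concrete, here is a deliberately simplified sketch of the mailbox and mailbox-list rules. It is not a conforming RFC 5322 parser: it ignores comments and folding white space (CFWS), quoted strings, groups and the obsolete forms, and it assumes display names contain no commas. The class `MailboxSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of "mailbox = name-addr / addr-spec"; not a full parser.
class MailboxSketch {
    /** Returns {displayName, addrSpec}; displayName is "" for a bare addr-spec. */
    static String[] parseMailbox(String input) {
        String s = input.trim();
        int open = s.lastIndexOf('<');
        if (open >= 0 && s.endsWith(">")) {
            // name-addr = [display-name] angle-addr
            String displayName = s.substring(0, open).trim();
            String addrSpec = s.substring(open + 1, s.length() - 1).trim();
            return new String[] { displayName, addrSpec };
        }
        // Otherwise treat the whole input as an addr-spec.
        return new String[] { "", s };
    }

    /** mailbox-list = (mailbox *("," mailbox)), split naively on commas. */
    static List<String[]> parseMailboxList(String input) {
        List<String[]> result = new ArrayList<String[]>();
        for (String part : input.split(",")) {
            if (!part.trim().isEmpty()) {
                result.add(parseMailbox(part));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String[] mailbox : parseMailboxList("John Doe <jdoe@example.com>, mary@example.org")) {
            System.out.println("[" + mailbox[0] + "] " + mailbox[1]);
        }
        // prints:
        // [John Doe] jdoe@example.com
        // [] mary@example.org
    }
}
```

For production use, an existing mail library is a safer choice than hand-written parsing, because the full grammar has many more cases than this sketch covers.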