Monday, October 31, 2011

SSL certificate in Java Keystore

Untrusted Certificate?
If you get an SSL certificate from a well-known public CA such as VeriSign, Thawte, DigiCert, or GeoTrust, both the JRE and browsers will recognize it. However, the JRE will not trust a certificate from a less common CA or a self-issued certificate (for example, one used for in-house testing). For instance, DST Root CA X3 isn't trusted by the Java/Android platform even though most browsers trust it.

How do I fix this? 
Import the certificate into the Java keystore.
First, save the certificate (*.cer).
Second, use keytool to import the root certificate into your cacerts keystore.

Import certificate
The cacerts file is located in your JRE install directory at "<JRE_HOME>/lib/security/cacerts" (the default keystore password is "changeit"). The command to import will be similar to:
$ keytool -keystore /usr/java/jre/lib/security/cacerts -storepass changeit -import -trustcacerts -v -alias DSTRootCAX3 -file dstRootCAX3.cer

Trust this certificate? [no]:  yes
Certificate was added to keystore
[Storing /usr/java/jre/lib/security/cacerts]

After the import completes, restart the affected services (Java processes) so that they pick up the updated keystore.

Verify imported certificate in keystore
C:\Program Files\Java\jdk1.7.0_01\jre>bin\keytool -list -keystore .\lib\security\cacerts -storepass changeit -v > newstore.out
C:\Program Files\Java\jdk1.7.0_01\jre>notepad newstore.out

Alias name: verisignclass1g2ca
Creation date: Mar 25, 2004
Entry type: trustedCertEntry

Owner: OU=VeriSign Trust Network, OU="(c) 1998 VeriSign, Inc. - For authorized use only", OU=Class 1 Public Primary Certification Authority - G2, O="VeriSign, Inc.", C=US
Issuer: OU=VeriSign Trust Network, OU="(c) 1998 VeriSign, Inc. - For authorized use only", OU=Class 1 Public Primary Certification Authority - G2, O="VeriSign, Inc.", C=US
Serial number: 4cc7eaaa983e71d39310f83d3a899192
Valid from: Sun May 17 17:00:00 PDT 1998 until: Tue Aug 01 16:59:59 PDT 2028
Certificate fingerprints:
     MD5:  DB:23:3D:F9:69:FA:4B:B9:95:80:44:73:5E:7D:41:83
     SHA1: 27:3E:E1:24:57:FD:C4:F9:0C:55:E8:2B:56:16:7F:62:F5:32:E5:47
     SHA256: 34:1D:E9:8B:13:92:AB:F7:F4:AB:90:A9:60:CF:25:D4:BD:6E:C6:5B:9A:51:CE:6E:D0:67:D0:0E:C7:CE:9B:7F
     Signature algorithm name: SHA1withRSA
     Version: 1
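
The same check can also be done from Java code. The sketch below is a minimal example, assuming the keystore can be opened with the JVM's default keystore type and that the alias is the one used in the import command; the class and method names (TrustStoreCheck, hasTrustedCert) are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

class TrustStoreCheck {

    // Returns true if the keystore read from `in` holds a trusted-certificate
    // entry under `alias`.
    static boolean hasTrustedCert(InputStream in, char[] password, String alias)
            throws IOException, GeneralSecurityException {
        KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
        ks.load(in, password);
        return ks.isCertificateEntry(alias);
    }

    public static void main(String[] args) throws Exception {
        // Assumed layout: <JRE_HOME>/lib/security/cacerts with the default password.
        Path cacerts = Paths.get(System.getProperty("java.home"), "lib", "security", "cacerts");
        try (InputStream in = Files.newInputStream(cacerts)) {
            System.out.println(hasTrustedCert(in, "changeit".toCharArray(), "DSTRootCAX3"));
        }
    }
}
```

Run it with the same JRE that your service uses, since each JRE install has its own cacerts file.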


*******************************************
*******************************************

Use -rfc to print the certificate in PEM format
C:\Program Files\Java\jdk1.7.0_01\jre>bin\keytool -list -keystore .\lib\security\cacerts -storepass changeit -rfc

Alias name: verisignclass1g2ca
Creation date: Mar 25, 2004
Entry type: trustedCertEntry

-----BEGIN CERTIFICATE-----
MIIDAjCCAmsCEEzH6qqYPnHTkxD4PTqJkZIwDQYJKoZIhvcNAQEFBQAwgcExCzAJBgNVBAYTAlVT
MRcwFQYDVQQKEw5WZXJpU2lnbiwgSW5jLjE8MDoGA1UECxMzQ2xhc3MgMSBQdWJsaWMgUHJpbWFy
eSBDZXJ0aWZpY2F0aW9uIEF1dGhvcml0eSAtIEcyMTowOAYDVQQLEzEoYykgMTk5OCBWZXJpU2ln
biwgSW5jLiAtIEZvciBhdXRob3JpemVkIHVzZSBvbmx5MR8wHQYDVQQLExZWZXJpU2lnbiBUcnVz
dCBOZXR3b3JrMB4XDTk4MDUxODAwMDAwMFoXDTI4MDgwMTIzNTk1OVowgcExCzAJBgNVBAYTAlVT
MRcwFQYDVQQKEw5WZXJpU2lnbiwgSW5jLjE8MDoGA1UECxMzQ2xhc3MgMSBQdWJsaWMgUHJpbWFy
eSBDZXJ0aWZpY2F0aW9uIEF1dGhvcml0eSAtIEcyMTowOAYDVQQLEzEoYykgMTk5OCBWZXJpU2ln
biwgSW5jLiAtIEZvciBhdXRob3JpemVkIHVzZSBvbmx5MR8wHQYDVQQLExZWZXJpU2lnbiBUcnVz
dCBOZXR3b3JrMIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCq0Lq+Fi24g9TK0g+8djHKlNgd
k4xWArzZbxpvUjZudVYKVdPfQ4chEWWKfo+9Id5rMj8bhDSVBZ1BNeuS65bdqlk/AVNtmU/t5eIq
WpDBucSmFc/IReumXY6cPvBkJHalzasab7bYe1FhbqZ/h8jit+U03EGI6glAvnOSPWvndQIDAQAB
MA0GCSqGSIb3DQEBBQUAA4GBAKlPww3HZ74sy9mozS11534Vnjty637rXC0Jh9ZrbWB85a7FkCMM
XErQr7Fd88e2CtvgFZMN3QO8x3aKtd1Pw5sTdbgBwObJW2uluIncrKTdcu1OofdPvAbT6shkdHvC
lUGcZXNY8ZCaPGqxmMnEh7zPRW1F4m4iP/68DzFc6PLZ
-----END CERTIFICATE-----


*******************************************
*******************************************

Thursday, October 13, 2011

Wildcard SSL certificate

What is a wildcard SSL certificate?
SSL certificates that contain the wildcard character "*" in the server's CN (common name) are called wildcard certificates. A "*" wildcard character MAY be used as the left-most name component in the certificate. For example, *.example.com would match a.example.com, foo.example.com, etc., but would not match example.com.
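
The matching rule can be sketched in a few lines of Java. This is a minimal illustration, not how any particular TLS library does it: the class and method names are hypothetical, and it assumes the common convention that "*" stands for exactly one left-most label (so a.b.example.com is rejected as well).

```java
import java.util.Locale;

class WildcardMatch {

    // True if `host` matches `pattern`. A pattern starting with "*." stands for
    // exactly one left-most label (assumption: single-label matching).
    static boolean matches(String pattern, String host) {
        String p = pattern.toLowerCase(Locale.ROOT);
        String h = host.toLowerCase(Locale.ROOT);
        if (!p.startsWith("*.")) {
            return p.equals(h);            // no wildcard: exact match only
        }
        String suffix = p.substring(1);    // e.g. ".example.com"
        if (!h.endsWith(suffix)) {
            return false;                  // also rejects the bare "example.com"
        }
        String label = h.substring(0, h.length() - suffix.length());
        return !label.isEmpty() && !label.contains(".");
    }
}
```

For example, matches("*.example.com", "foo.example.com") is true, while matches("*.example.com", "example.com") is false.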

When to use a wildcard SSL certificate?

1. A wildcard SSL certificate suits a single domain that needs multiple subdomains, such as
a.example.com
b.example.com
www.example.com
foo.example.com
Instead of purchasing four SSL certificates, you can purchase one *.example.com wildcard SSL certificate.

2. A wildcard certificate also suits many servers that use different subdomains.

3. Wildcard certificates don't support EV (Extended Validation), so if you need EV you have to use regular certificates.

What is the price?

Wildcard providers have two pricing models: per server and unlimited servers. The prices and providers below are as of Oct 1, 2011 and are subject to change without notice, so always check the provider's official website or sales rep for the latest quote and product information.

DigiCert: $475 per year (3-year term, unlimited servers)
http://www.digicert.com/ssl-certificate-comparison.htm

Thawte
The wildcard certificate is $639, and every additional server you need it on is $447 (a 3-year term has a 15% discount).
[This information came from a sales rep when I contacted them.]
http://www.thawte.com/ssl/volume-discount-ssl-certificates/index.html

VeriSign - unknown (It is expensive, might be around $800)
http://www.verisign.com/ssl/buy-ssl-certificates/index.html?tid=a_box

GeoTrust Wildcard $446.00
http://www.geocerts.com/ssl/wildcard
http://www.geotrust.com/ssl/wildcard-ssl-certificates/

GoDaddy is the cheapest at $179.99
http://www.godaddy.com/ssl/ssl-certificates.aspx


One VIP, multiple certs?
There seems to be no single good answer to this question, since different load balancers may behave differently, but according to the article below F5 supports it via TLS Server Name Indication (SNI) in iRules:
http://devcentral.f5.com/Tutorials/TechTips/tabid/63/articleType/ArticleView/articleId/1086451/Multiple-Certs-One-VIP-TLS-Server-Name-Indication-via-iRules.aspx
DigiCert also seems to support multiple domain names in one wildcard certificate via SubjectAltName:
http://www.digicert.com/ssl-support/wildcard-san-names.htm

Wednesday, October 12, 2011

iCalendar 101

This week we found an issue when sending iCalendar invites through a Microsoft Exchange server: recipients could not receive the meeting invite. The root cause was that we didn't set the MAILTO value and the CN parameter of the ATTENDEE property in the VEVENT component of the core iCalendar object. This may be an Exchange-specific requirement, much as Outlook requires the UID and DTSTAMP properties in an *.ics file, because we didn't catch this defect when using a simple POP3 email server.

I took this chance to get a general idea of iCalendar from a Google search (see the reference resources at the bottom). iCalendar is a standard (RFC 5545) for calendar data exchange: a file format that allows Internet users to send meeting requests and tasks to other Internet users. Two popular file extensions are *.ics (calendaring and scheduling information) and *.ifb (free/busy time information). iCalendar is designed to be independent of the transport protocol.

An iCalendar object always begins with BEGIN:VCALENDAR and ends with END:VCALENDAR, which together delimit the core object. Within the iCalendar object we can define calendar properties and calendar components (VEVENT, VTODO, VJOURNAL, VFREEBUSY, VTIMEZONE, VALARM). A property can have multiple parameters, and a component can have multiple properties or sub-components. The result is a tree structure that describes the Internet Calendaring and Scheduling Core Object (see RFC 5545 for details).

An example that illustrates the ics object model:

BEGIN:VCALENDAR  -- starts iCalendar object
VERSION:2.0 -- calendar property
METHOD:PUBLISH
BEGIN:VTIMEZONE  -- starts VTIMEZONE component
TZID:India Standard Time -- component property
BEGIN:STANDARD -- starts sub component
DTSTART:16010101T000000
TZOFFSETFROM:+0530
TZOFFSETTO:+0530
END:STANDARD -- ends sub component
END:VTIMEZONE -- ends VTIMEZONE component
BEGIN:VEVENT -- starts another component (event)
DTSTART;TZID="India Standard Time":20111019T110000 -- component property, event start time
DTEND;TZID="India Standard Time":20111019T120000 -- component property, event end time
LOCATION;ENCODING=QUOTED-PRINTABLE:Webinar - See conference call information below -- component property, with parameters ENCODING
UID:100000000040827055 -- component property, required by outlook, unique ID for current event
DTSTAMP:20111012T172729Z -- component property, required
DESCRIPTION: Click this link to join the Webinar
SUMMARY;ENCODING=QUOTED-PRINTABLE:Moving your data to the Cloud - Part 1
BEGIN:VALARM -- starts alarm component within VEVENT
TRIGGER:-PT15M -- component property, alarm trigger time
ACTION:DISPLAY
DESCRIPTION:Reminder
END:VALARM -- ends alarm sub component
END:VEVENT -- ends VEvent component
END:VCALENDAR -- ends iCalendar object

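Formatted with indentation, the ics file above looks like this (the indentation is for readability only; in a real file, leading whitespace marks a folded continuation line):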

BEGIN:VCALENDAR  -- starts iCalendar object
    VERSION:2.0 -- calendar property
    METHOD:PUBLISH
    BEGIN:VTIMEZONE  -- starts VTIMEZONE component
        TZID:India Standard Time -- component property
        BEGIN:STANDARD -- starts sub component
            DTSTART:16010101T000000
            TZOFFSETFROM:+0530
            TZOFFSETTO:+0530
        END:STANDARD -- ends sub component
    END:VTIMEZONE -- ends VTIMEZONE component
    BEGIN:VEVENT -- starts another component (event)
        DTSTART;TZID="India Standard Time":20111019T110000 -- component property, event start time
        DTEND;TZID="India Standard Time":20111019T120000 -- component property, event end time
        LOCATION;ENCODING=QUOTED-PRINTABLE:Webinar - See conference call information below -- component property with an ENCODING parameter; note that parameters follow the property name after a semicolon, and the value follows the colon
        UID:100000000040827055 -- component property, required by outlook, unique ID for current event
        DTSTAMP:20111012T172729Z -- component property, required
        DESCRIPTION: Click this link to join the Webinar
        SUMMARY;ENCODING=QUOTED-PRINTABLE:Moving your data to the Cloud - Part 1
        BEGIN:VALARM -- starts alarm component within VEVENT
            TRIGGER:-PT15M -- component property, alarm trigger time
            ACTION:DISPLAY
            DESCRIPTION:Reminder
        END:VALARM -- ends alarm sub component
    END:VEVENT -- ends VEvent component
END:VCALENDAR -- ends iCalendar object
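
The tree structure can be made visible with a small helper that indents each line by its BEGIN/END nesting depth. This is only a sketch for reading ics data: the class name is hypothetical, the input is assumed to be already unfolded (one property per line), and the indented output is not itself a valid ics file because leading whitespace means line folding in iCalendar.

```java
import java.util.ArrayList;
import java.util.List;

class IcsIndent {

    // Indents each content line by four spaces per enclosing BEGIN:/END: pair.
    static List<String> indent(List<String> lines) {
        List<String> out = new ArrayList<>();
        int depth = 0;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;                              // skip blank lines
            }
            String upper = line.toUpperCase();
            if (upper.startsWith("END:") && depth > 0) {
                depth--;                               // END closes the current component
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < depth; i++) {
                sb.append("    ");
            }
            out.add(sb.append(line).toString());
            if (upper.startsWith("BEGIN:")) {
                depth++;                               // BEGIN opens a nested component
            }
        }
        return out;
    }
}
```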

Accepting, tentatively accepting, or declining an invite
In iCalendar you need to set the RSVP parameter on the ATTENDEE property; otherwise the user sees only "Save and Close" instead of the accept/tentative/decline options. The ORGANIZER property defines the organizer of the calendar component. When an attendee accepts, declines, or tentatively accepts, the organizer gets the response.

http://tools.ietf.org/html/rfc5545#section-3.2.17
   Parameter Name:  RSVP
   Purpose:  To specify whether there is an expectation of a favor of a
      reply from the calendar user specified by the property value.

http://tools.ietf.org/html/rfc5545#section-3.8.4.3
   Property Name:  ORGANIZER
   Purpose:  This property defines the organizer for a calendar
      component.
   Value Type:  CAL-ADDRESS
   Property Parameters:  IANA, non-standard, language, common name,
      directory entry reference, and sent-by property parameters can be
      specified on this property.
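
As a rough illustration, the two property lines could be built as below. This is a sketch under assumptions: the class and method names are hypothetical, the ROLE and PARTSTAT values are typical choices for a new invite rather than something taken from our fix, and the 75-octet line folding that RFC 5545 asks for is not handled.

```java
class CalAddress {

    // Quotes a parameter value when it contains ':', ';' or ',' (RFC 5545 section 3.2).
    static String param(String value) {
        String v = value.replace("\"", "");   // double quotes are not allowed inside parameter values
        return v.matches(".*[:;,].*") ? "\"" + v + "\"" : v;
    }

    // ORGANIZER line with a CN parameter and a mailto: value.
    static String organizer(String name, String email) {
        return "ORGANIZER;CN=" + param(name) + ":mailto:" + email;
    }

    // ATTENDEE line that asks for a reply (RSVP=TRUE) and carries CN plus a mailto: value.
    static String attendee(String name, String email) {
        return "ATTENDEE;ROLE=REQ-PARTICIPANT;PARTSTAT=NEEDS-ACTION;RSVP=TRUE;CN="
                + param(name) + ":mailto:" + email;
    }
}
```

Both lines belong inside the VEVENT component, next to UID and DTSTAMP.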

Programming iCalendar

Prepare a multipart/alternative mail:
Part 1: text/html. This part is displayed by ordinary mail readers (those that do not support iCalendar) or as a fallback, and contains a summary of the event in human-readable form.
Part 2: text/calendar; method=xxx. This part holds the contents of the ics file (the method parameter in the header must match the METHOD in the ics). The default encoding in iCalendar is UTF-8.
Part 3: Optional. Attach the .ics file itself so that ordinary mail readers can offer the user something to click on. Outlook does not really require the attachment because it just reads the text/calendar part.

Code snippet using JavaMail
// message is a javax.mail.internet.MimeMessage; buffer holds the ics file data
message.addHeaderLine("method=REQUEST");
message.addHeaderLine("charset=UTF-8");
message.addHeaderLine("component=VEVENT");

// messageBodyPart is the javax.mail.internet.MimeBodyPart that carries the calendar data
messageBodyPart.setHeader("Content-Class", "urn:content-classes:calendarmessage");
messageBodyPart.setHeader("Content-ID", "calendar_message");
// Very important: the data source type must be text/calendar
messageBodyPart.setDataHandler(new DataHandler(
        new ByteArrayDataSource(buffer.toString(), "text/calendar")));

Use iCal4j
This open source project provides APIs for reading and writing ics files.


The following applications (calendar or email reader) already support iCalendar
  • Google Calendar
  • Apple iCal
  • Lotus Notes
  • Outlook 2000/2007/2010
  • Windows Live Calendar
  • Yahoo Calendar
  • Mozilla Thunderbird
  • SeaMonkey

References:
http://en.wikipedia.org/wiki/ICalendar
http://tools.ietf.org/html/rfc5545 (rfc2445 was obsoleted by rfc5545 in 2009)
http://www.kanzaki.com/docs/ical/
http://build.mnode.org/projects/ical4j/project-info.html
http://stackoverflow.com/questions/461889/sending-outlook-meeting-requests-without-outlook

Friday, October 7, 2011

Open sources for big data analytics

Today I attended a webinar called "Big Data Technologies for Social Media Analytics" from Impetus Technologies. They introduced their iLaDaP platform, which is built on top of a number of open source libraries. There were some case studies on financial and online-retailer data analytics, but they were not very detailed. My takeaway from this webinar is that there are many open source projects surrounding Hadoop for big data analysis. Rather than simply adding them to your project, you need to understand their pros and cons.

Hadoop
http://hadoop.apache.org/
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.

Hadoop MapReduce
http://hadoop.apache.org/mapreduce/
Hadoop MapReduce is a programming model and software framework for writing applications that rapidly process vast amounts of data in parallel on large clusters of compute nodes.

Hadoop HDFS
http://hadoop.apache.org/hdfs/
Hadoop Distributed File System (HDFS™) is the primary storage system used by Hadoop applications. HDFS creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, extremely rapid computations.

Hive
http://hive.apache.org/
Hive is a data warehouse system for Hadoop that facilitates easy data summarization, ad-hoc queries, and the analysis of large datasets stored in Hadoop compatible file systems.

Apache Pig
http://pig.apache.org/
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs.

Oozie
https://github.com/yahoo/oozie
Oozie - workflow engine for Hadoop

Sqoop
https://github.com/cloudera/sqoop/wiki
Sqoop is a tool designed to import data from relational databases into Hadoop.

Mahout
http://mahout.apache.org/
Scalable machine learning libraries. Mahout has implementations of a wide range of machine learning and data mining algorithms: clustering, classification, collaborative filtering and frequent pattern mining

HBase
http://hbase.apache.org/
HBase is the Hadoop database. Use it when you need random, realtime read/write access to your Big Data.

Flume
https://github.com/cloudera/flume
Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.

Apache Camel
http://camel.apache.org/
Apache Camel is a powerful open source integration framework based on known Enterprise Integration Patterns with powerful Bean Integration.

NLTK: Natural Language Toolkit
http://www.nltk.org/
Open source Python modules, linguistic data and documentation for research and development in natural language processing and text analytics, with distributions for Windows, Mac OSX and Linux.

The Impetus webinar presenter also mentioned two companies in this area.
Intellicus
http://www.intellicus.com/
Intellicus is one of the leading providers of next generation web-based Business Intelligence and Reporting solutions.

Greenplum
http://www.greenplum.com/
Greenplum is the pioneer of Enterprise Data Cloud solutions for large-scale data warehousing and analytics.

Monday, October 3, 2011

ABR Streaming

Adaptive is a new keyword in WPO (web performance optimization) blogs and articles: adaptive images, adaptive video, adaptive streaming, and more. The idea is to detect differences among clients (browser, device, media player) in CPU, bandwidth, screen size, resolution, RTT, and so on, and to serve different content adaptively over HTTP. Compared with RTP (Real-time Transport Protocol), HTTP is the more CDN-friendly choice because far more HTTP servers are in operation at the edge.

ABR (adaptive bit rate) video streaming detects the user's bandwidth and CPU capacity in real time and adjusts the quality of the video stream accordingly. Move Networks created this idea in 2006. They built a product that trans-rated each video into multiple versions of the same asset, encoded at different bit rates. Their product further divided each video into many small chunks, or “streamlets”, each a few seconds long. They built a player that downloaded a video as a series of HTTP GET requests for sequential streamlets. The player continuously measured the available bandwidth so that the next GET request would be for the version of the streamlet best matched to the measured available bit rate.
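
The selection step of such a player can be sketched as follows. This is a simplified illustration and not Move Networks' actual algorithm: the class name, the safety factor, and the fall-back to the lowest rendition are all assumptions.

```java
class BitrateSelector {

    // Picks the highest rendition whose bit rate fits within the measured
    // bandwidth scaled by a safety factor; falls back to the lowest rendition
    // when none fits.
    static int select(int[] bitratesBps, long measuredBps, double safetyFactor) {
        if (bitratesBps.length == 0) {
            throw new IllegalArgumentException("no renditions");
        }
        double budget = measuredBps * safetyFactor;
        int lowest = bitratesBps[0];
        int best = -1;
        for (int rate : bitratesBps) {
            lowest = Math.min(lowest, rate);
            if (rate <= budget && rate > best) {
                best = rate;
            }
        }
        return best >= 0 ? best : lowest;
    }
}
```

A player would call select(...) before each GET request, using the throughput it measured while downloading the previous chunks.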

The chunked concept was very successful, though Move Networks' business was not. Apple, Microsoft, and Adobe have all implemented ABR, and Netflix (video streaming) uses ABR too.

Apple HLS (HTTP Live Streaming) works by breaking the whole stream into a sequence of small HTTP-based file downloads. As the stream is played, the client selects from a number of different bit-rate streams based on client CPU and bandwidth. The M3U8 playlist is the first request, and it contains the metadata for the various sub-streams.
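
To give a feel for that metadata, the sketch below reads the variant list out of a master playlist. It is a minimal example, assuming each variant is announced by an #EXT-X-STREAM-INF line that carries a BANDWIDTH attribute and is followed by the variant's URI; the class name is hypothetical and all other tags are ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MasterPlaylist {

    // Matches BANDWIDTH=<digits> but not AVERAGE-BANDWIDTH=<digits>.
    private static final Pattern BANDWIDTH = Pattern.compile("(?:^|[:,])BANDWIDTH=(\\d+)");

    // Maps each variant URI to its declared BANDWIDTH, in playlist order.
    static Map<String, Integer> variants(String playlist) {
        Map<String, Integer> out = new LinkedHashMap<>();
        Integer pending = null;
        for (String raw : playlist.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.startsWith("#EXT-X-STREAM-INF:")) {
                Matcher m = BANDWIDTH.matcher(line);
                pending = m.find() ? Integer.valueOf(m.group(1)) : null;
            } else if (!line.isEmpty() && !line.startsWith("#") && pending != null) {
                out.put(line, pending);   // the URI line that follows the tag
                pending = null;
            }
        }
        return out;
    }
}
```

The client can then pick one of the returned bit rates that fits the bandwidth it has measured.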

Microsoft HSS (HTTP Smooth Streaming) is an IIS Media Services extension that enables adaptive streaming of media to Silverlight and other clients over HTTP. HSS uses the simple concept of delivering small content fragments (typically 2 seconds of video) and verifying that each has arrived within the appropriate time and played back at the expected quality level. Based on the result, it adapts the delivery of the next fragment. The manifest file is the first request, and it describes the fragment metadata to the client.

From these two implementations, we can see that an ABR solution needs a client (player), an HTTP streaming server, a transcoder (to break the whole content into small chunks at different bit rates), and a manifest file for the ABR metadata. Serving or fetching content adaptively in real time provides good performance and user experience. We should be able to apply a similar idea to other WPO initiatives.

References:
http://en.wikipedia.org/wiki/Adaptive_bitrate_streaming
http://www.contentdeliverynews.com/?page_id=93