“More creative and varied” is how Brad Casemore describes the current nature of DDoS attacks in a recent article on the TechTarget website. Casemore, research director at International Data Corporation (IDC), said that the burden is on the shoulders of IT product/service vendors to come up with improved solutions for detecting and mitigating threats like DDoS.
The need for such solutions becomes even greater with the growing trend of encrypting network traffic, which increases the likelihood of abuse by hackers and creates yet another vulnerability to security threats. This is the observation of Paul Nicholson, product marketing director at A10, a company that provides application networking technologies focused on optimizing the performance of data center applications and networks.
What A10 has done lately is produce an anti-DDoS appliance branded Thunder TPS (threat protection system). The product may be relevant only to large data centers at this time, because that is apparently the user category A10 primarily had in mind when designing Thunder TPS. Be that as it may, the important thing to note is that the idea of an anti-DDoS appliance has been implemented and is now on the market.
Making data centers the environment model for Thunder TPS was influenced by the escalating incidence of complex DDoS attacks against data centers and large enterprises as a whole. This is a blessing for the user community because, as it turns out, the resulting product takes a two-pronged approach to threat mitigation, addressing both the breadth and the size of attacks.
Like all other existing technology products designed for contending against security threats, Thunder TPS is not invincible. “Really big attacks could overwhelm it,” says security analyst Adrian Sanabria of 451 Research. Sanabria recommends pairing Thunder TPS with “something cloud-based or upstream”.
Nicholson gave some insight into the DDoS appliance’s attack prevention measures. Thunder TPS comes bundled with software that lets users block attacks flexibly: they can apply regular expression rules, or program their own rules using the product’s aFlex tool.
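As a rough illustration of the regex-rule idea (not A10’s actual aFlex syntax, which is a scripting tool of its own), a blocking rule can be sketched in Python; the patterns, payloads, and function names below are invented for this example:

```python
import re

# Hypothetical blocking rules for illustration only; these patterns
# are made up and are not A10's aFlex rule syntax.
BLOCK_RULES = [
    re.compile(rb"(?i)\b(union\s+select|sleep\(\d+\))"),  # SQL-injection probes
    re.compile(rb"[\x00-\x08]{8,}"),                      # long runs of control bytes
]

def should_block(payload: bytes) -> bool:
    """Return True if any blocking rule matches the raw payload."""
    return any(rule.search(payload) for rule in BLOCK_RULES)
```

In a real appliance, rules like these would be evaluated inline on traffic at wire speed rather than in a Python loop; the sketch only shows the matching logic.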
In addition, Thunder TPS features “more robust SSL protection to validate whether clients attempting to access the network are legitimate or part of a botnet” (to use Nicholson’s words). The appliance can detect the presence and identity of potential threats through its access to “more than 400 destination-specific behavior counters”. Its software enables inspection of MPLS-encapsulated traffic and the use of NAT (network address translation) as an alternative to tunneling when the appliance moves sanitized traffic to other parts of the network.
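The general notion of a destination-specific behavior counter can be sketched generically; the sliding-window design, threshold, and class below are assumptions made for illustration, not A10’s implementation:

```python
import time
from collections import defaultdict, deque

# Hypothetical per-destination rate counter: window size and threshold
# are invented values, not Thunder TPS internals.
class DestinationCounter:
    def __init__(self, window_seconds=1.0, threshold=100):
        self.window = window_seconds
        self.threshold = threshold
        self.events = defaultdict(deque)  # destination -> event timestamps

    def record(self, dest, now=None):
        """Record one event (e.g. a packet or request) for a destination."""
        now = time.monotonic() if now is None else now
        q = self.events[dest]
        q.append(now)
        # Drop events that have fallen out of the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()

    def is_anomalous(self, dest):
        """Flag a destination whose event rate exceeds the threshold."""
        return len(self.events[dest]) > self.threshold
```

An appliance would keep hundreds of such counters per destination (bytes, SYNs, connection rates, and so on) and combine them, but the track-and-threshold pattern is the same.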
Considering that Thunder TPS is data center oriented, users can expect that it is not a plug-and-play affair. They are likely to need their in-house IT experts to coordinate with the Thunder TPS deployment team, plus the help of external IT professionals if necessary.
A few posts back, we encountered the evolving term big data, which describes the gigantic mass of data that big business enterprises are eyeing to mine for whatever value can be obtained from it.
Examples of big data may be found in the unimaginable collection of facts, figures, and image/video/multimedia data that the Google search engine has piled up from 1997 to the present, as well as in the staggering amount of personal and related data that Facebook has collected from its more than 1.35 billion registered users worldwide since Mark Zuckerberg established it in 2004. Other organizations have their own sets of big data from their own sources.
The process of big data collection alone is an enormous effort that requires the backend support of data centers running 24/7 the whole year, along with the advanced technology packed inside those data centers. With the extremely high cost of collecting big data, it is only natural for the business enterprise involved to recover that cost by making use of the Godzilla-sized data waiting to be tapped in its storage devices. An important step in using big data is data analytics, and this too requires advanced technology.
Fortunately, such technology exists, thanks to hardware/software vendors and open-source software developers who keep coming up with more powerful processing capability, increased levels of memory, advances in bandwidth, and highly distributed architectures that measure up to the challenge of big data.
One particular technology that stands out from the many offerings in the market is Apache Hive, which the Apache Software Foundation itself describes as “a data warehouse software (that) facilitates querying and managing large datasets residing in distributed storage”.
Hive does not work alone. It is built on top of, and works with, Apache Hadoop, open-source software that allows distributed processing of large data sets across clusters of computers using simple programming models. Hadoop is designed for scalability: user organizations can start with a single server machine and scale up to hundreds or thousands, each capable of local computation and storage. The Hadoop software library is designed to detect and handle failures at the application layer, which means highly available service over clustered machines.
Hive has tools to easily extract, transform, and load subsets of big data stored in HDFS (the Hadoop Distributed File System) or in other compatible storage systems such as Apache HBase. It can impose structure on a variety of data formats, which makes it possible to query the data using HiveQL, a query language that resembles SQL. The ability to query, in turn, provides the ability to analyze the data and extract value from it.
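The "impose structure, then query" idea (Hive's schema-on-read model) can be illustrated in plain Python; the file layout, column names, and the HiveQL shown in the comment are made-up examples, not a real Hive deployment:

```python
import csv
import io

# Raw, unstructured-looking log data as Hive might find it in HDFS.
# The field layout is an assumption invented for this sketch.
raw = """2024-01-05,alice,PAGE_VIEW
2024-01-05,bob,PURCHASE
2024-01-06,alice,PURCHASE
"""

# Schema on read: impose (event_date, user, action) on the raw
# text only at query time, not when the data was written.
columns = ("event_date", "user", "action")
rows = [dict(zip(columns, rec)) for rec in csv.reader(io.StringIO(raw))]

# Roughly what this hypothetical HiveQL would express:
#   SELECT user, COUNT(*) FROM events
#   WHERE action = 'PURCHASE' GROUP BY user;
purchases = {}
for r in rows:
    if r["action"] == "PURCHASE":
        purchases[r["user"]] = purchases.get(r["user"], 0) + 1
```

Hive does this declaratively over files far too large for one machine; the point of the sketch is only that the structure lives in the query layer, not in the stored bytes.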
Data queries on Hive are executed via Hadoop MapReduce, a software framework for easily writing applications that process multi-terabyte data sets in parallel on clusters consisting of thousands of nodes. (Apache Pig, a related data analysis platform, likewise compiles its scripts into sequences of MapReduce programs behind the scenes.) MapReduce and HDFS run on the same set of nodes.
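A minimal local simulation helps show what the MapReduce model does; the classic word-count example below runs the map, shuffle, and reduce steps in a single process, whereas Hadoop distributes the same logic across many nodes:

```python
from collections import defaultdict

# Local simulation of the MapReduce model: a map phase emits key/value
# pairs, a shuffle groups them by key, and a reduce phase aggregates
# each group. Word count is the canonical example.

def map_phase(document):
    """Emit (word, 1) for every word in one input document."""
    for word in document.split():
        yield word.lower(), 1

def reduce_phase(word, counts):
    """Sum all partial counts emitted for one word."""
    return word, sum(counts)

def run_mapreduce(documents):
    shuffled = defaultdict(list)   # the "shuffle" step: group values by key
    for doc in documents:
        for key, value in map_phase(doc):
            shuffled[key].append(value)
    return dict(reduce_phase(k, v) for k, v in shuffled.items())

result = run_mapreduce(["big data big clusters", "data everywhere"])
```

Because each map call touches only its own document and each reduce call only its own key, Hadoop can scatter both phases across thousands of nodes without the functions changing.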
Apache Hive and all the collaborating software need appropriate IT infrastructure to host them. Unless you have the necessary talent in-house, you will need qualified IT professionals to help you plan infrastructure acquisition and configuration, because there will be plenty of technical details to attend to before Apache Hive can make big data analytics a reality in your business.