Tuesday, 18 June 2013
The tech industry has been buzzing about “big data” for years now. And according to venture capital firm Accel Partners, the excitement around the big data space is not set to die down any time soon — it’s just about to enter a new phase.
Accel is announcing tonight that it has closed on $100 million for a new investment fund called Big Data Fund 2. The fund is the same size as Accel’s first big data focused fund, which launched with $100 million back in November 2011.
As part of the new fund, Accel is also adding QlikView CTO Anthony Deighton and Imperva CEO Shlomo Kramer to its Big Data Fund Advisory Council, which Accel has said is meant to serve as a “guiding light” to help think through investments and track entrepreneurs doing interesting things in the space.
Despite the nearly identical name, Accel’s Big Data Fund 2 will mark a definite shift in focus from the firm’s first big data fund, partner Jake Flomenberg said in a phone call today. “Over the past few years, we’ve focused a tremendous amount of attention on what people like to call the ‘three Vs’ of big data: variety, volume and velocity,” he said. “We now believe there is a fourth V, which is end user value, and that hasn’t been addressed to the same extent,” and that is where Big Data Fund 2 will be focusing the bulk of its investment and attention.
Specifically, Accel believes that the “last mile” for big data will be served largely by startups focused on data-driven software, or “DDS.” These startups have largely been made possible by the hardware and infrastructure technology innovations that defined big data’s first wave, Flomenberg says. In a prepared statement from Accel, Facebook engineering VP Jay Parikh, who also serves on Accel’s Big Data Advisory Council, explained it like this:
“The last mile of big data will be built by a new class of software applications that enable everyday users to get real value out of all the data being created. Today’s entrepreneurs are now able to innovate on top of a technology stack that has grown increasingly powerful in the last few years – enabling product and analytical experiences that are more personalized and more valuable than ever.”
One example of a “fourth V” DDS startup Flomenberg pointed to is RelateIQ, the “next generation relationship manager” software startup that launched out of stealth last week with some $29 million in funding from Accel and others.
Accel’s existing portfolio of big data investments also includes Cloudera, Couchbase, Lookout, Nimble Storage, Opower, Prismatic, QlikView, Sumo Logic, and Trifacta.
Thursday, 6 June 2013
VMware has revealed its VMware vCloud Hybrid Service, an infrastructure as a service (IaaS) platform.
“VMware’s mission is to radically simplify IT and help customers transform their IT operations,” said Pat Gelsinger, CEO of VMware.
“Today, with the introduction of the VMware vCloud Hybrid Service, we take a big step forward by coupling all the value of VMware virtualisation and software-defined data centre technologies with the speed and simplicity of a public cloud service that our customers desire.”
vCloud Hybrid Service will extend VMware software, currently used by hundreds of thousands of customers, into the public cloud. This means customers will be able to use the same skills, tools, networking and security models across both on-premises and off-premises environments.
“As a source of competitive advantage for our international business, our operations and IT department needs the agility and efficiency the public cloud promises,” says Julio Sobral, senior VP of business operations at Fox International.
“However, we don’t have the luxury of starting from scratch; we see in the vCloud Hybrid Service a potential solution to enable Fox International to have a more elastic platform that will support future deployments around the world. Working with technology partners like VMware gives us the best of both worlds by extending our existing infrastructure to realise the benefits of public cloud.”
According to the company, the vCloud Hybrid Service will allow customers to extend their data centres to the cloud and will support thousands of applications and more than 90 operating systems that are certified to run on vSphere. This means customers can get the same level of availability and performance running in the public cloud, without changing or rewriting their applications.
Built on vSphere, vCloud Hybrid Service offers automated replication, monitoring and high availability for business-critical applications, leveraging the advanced features of vSphere, including VMware vMotion, High Availability and vSphere Distributed Resource Scheduler.
“Our new VMware vCloud Hybrid Service delivers a public cloud that is completely interoperable with existing VMware virtualised infrastructure,” said Chris Norton, regional director at VMware for southern Africa.
“By taking an ‘inside-out’ approach that will enable new and existing applications to run anywhere, this service will bridge the private and public cloud worlds without compromise.”
According to VMware, the vCloud Hybrid Service will be available this month through an early access programme.
Monday, 3 June 2013
Hadoop is the poster child for Big Data, so much so that the open source data platform has become practically synonymous with the wildly popular term for storing and analyzing huge sets of information.
While Hadoop is not the only Big Data game in town, the software has had a remarkable impact. But exactly why has Hadoop been such a major force in Big Data? What makes this software so damn special - and so important?
Sometimes the reasons behind something's success can be staring you right in the face. For Hadoop, the biggest motivator in the market is simple: Before Hadoop, data storage was expensive.
Hadoop, however, lets you store as much data as you want in whatever form you need, simply by adding more servers to a Hadoop cluster. Each new server (which can be commodity x86 machines with relatively small price tags) adds more storage and more processing power to the overall cluster. This makes data storage with Hadoop far less costly than prior methods of data storage.
(See also Hadoop: What It Is And How It Works.)
Spendy Storage Created The Need For Hadoop
We're not talking about data storage in terms of archiving… that's just putting data onto tape. Companies need to store increasingly large amounts of data and be able to easily get to it for a wide variety of purposes. That kind of data storage was, in the days before Hadoop, pricey.
And, oh what data there is to store. Enterprises and smaller businesses are trying to track a slew of data sets: emails, search results, sales data, inventory data, customer data, click-throughs on websites… all of this and more is coming in faster than ever before, and trying to manage it all in a relational database management system (RDBMS) is a very expensive proposition.
Historically, organizations trying to manage costs would sample that data down to a smaller subset. This down-sampled data would automatically carry certain assumptions, number one being that some data is more important than other data. For example, a company depending on e-commerce data might prioritize its data on the (reasonable) assumption that credit card data is more important than product data, which in turn would be more important than click-through data.
Assumptions Can Change
That's fine if your business is based on a single set of assumptions. But what happens if the assumptions change? Any new business scenarios would have to use the down-sampled data still in storage, the data retained based on the original assumptions. The raw data would be long gone, because it was too expensive to keep around. That's why it was down-sampled in the first place.
Expensive RDBMS-based storage also led to data being siloed within an organization. Sales had its data, marketing had its data, accounting had its own data and so on. Worse, each department may have down-sampled its data based on its own assumptions. That can make it very difficult (and misleading) to use the data for company-wide decisions.
Hadoop: Breaking Down The Silos
Hadoop's storage method uses a distributed filesystem that maps data wherever it sits in a cluster on Hadoop servers. The tools to process that data are also distributed, often located on the same servers where the data is housed, which makes for faster data processing.
Hadoop, then, allows companies to store data much more cheaply. How much more cheaply? In 2012, Rainstor estimated that running a 75-node, 300TB Hadoop cluster would cost $1.05 million over three years. In 2008, Oracle sold a database with a little over half the storage (168TB) for $2.33 million - and that's not including operating costs. Throw in the salary of an Oracle admin at around $95,000 per year, and you're talking an operational cost of $2.62 million over three years - 2.5 times the cost, for just over half of the storage capacity.
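The arithmetic behind that comparison is easy to verify. Here is a quick sketch, using only the figures reported above (the Rainstor and Oracle numbers are as cited in the original estimates, not independently confirmed):

```python
# Three-year total-cost comparison, using the figures reported above.
hadoop_cost = 1_050_000     # 75-node, 300 TB Hadoop cluster (2012 Rainstor estimate)
oracle_license = 2_330_000  # 168 TB Oracle database (2008 price)
admin_salary = 95_000       # approximate annual Oracle admin salary
years = 3

oracle_total = oracle_license + admin_salary * years
print(f"Oracle three-year cost: ${oracle_total:,}")        # $2,615,000 (~$2.62M)
print(f"Cost ratio vs. Hadoop: {oracle_total / hadoop_cost:.1f}x")
print(f"Oracle storage vs. Hadoop: {168 / 300:.0%}")
```

Running this reproduces the roughly 2.5x cost multiple for just over half (56%) of the storage capacity cited in the paragraph above.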
This kind of price savings means Hadoop lets companies afford to hold all of their data, not just the down-sampled portions. Fixed assumptions don't need to be made in advance. All data becomes equal and equally available, so business scenarios can be run with raw data at any time as needed, without limitation or assumption. This is a very big deal, because if no data needs to be thrown away, any data model a company might want to try becomes fair game.
That scenario is the next step in Hadoop use, explained Doug Cutting, Chief Architect of Cloudera and an early Hadoop pioneer. "Now businesses can add more data sets to their collection," Cutting said. "They can break down the silos in their organization."
More Hadoop Benefits
Hadoop also lets companies store data as it comes in - structured or unstructured - so you don't have to spend money and time configuring data for relational databases and their rigid tables. Since Hadoop can scale so easily, it can also be the perfect platform to catch all the data coming from multiple sources at once.
Hadoop's most touted benefit is its ability to store data much more cheaply than can be done with RDBMS software. But that's only the first part of the story. The capability to catch and hold so much data so cheaply means businesses can use all of their data to make more informed decisions.
Thursday, 7 March 2013
Brazil, Russia, India and China still lag far behind developed countries in policies considered critical for the future of cloud computing, but each made some progress over the past year, a US industry group said.
The Business Software Alliance, which represents US industry heavyweights such as Microsoft, said the BRIC nations all came in at the bottom half of 24 countries surveyed in its second annual cloud computing report.
Brazil moved from final position to 22nd with a tally of 44.1 out of a possible 100 points.
China, India and Russia each also rose two slots with scores of 51.5, 53.1 and 59.1, respectively.
Cloud computing refers to providing software, storage, computing power and other services to customers from remote data centers over the Web.
Demand for cloud-based software is rising rapidly because the approach allows companies to start using new programs faster and at lower cost than traditional products that are installed at a customer's own data center.
"The cloud is really the hot sector of IT right now," and US companies have a big interest in countries harmonizing policies instead of chopping the cloud into pieces, said Robert Holleyman, president of the Business Software Alliance.
At the same time, the aggregation of massive amounts of data in large data centers "creates new and highly tempting targets" for cyber attacks, making it vital that both law enforcement officials and cloud providers have adequate tools to fight the intrusions, the BSA report said.
"Australia, France, Germany, and Japan score extremely highly in the cybercrime section. Canada, China, (South) Korea, Russia, and Vietnam score poorly. The country that shows the most improvement is Brazil, which finally passed cybercrime laws after a long campaign," the report said.
The 24 countries included in the survey represent 80 per cent of the global information and communications technology industry. They were assessed in seven areas, including data privacy, security, free trade, intellectual property protection, infrastructure and support for industry-led standards to promote smooth data flows.
China got a small boost in this year's rating for introducing new data privacy laws, while Russia got credit for reforms made as a result of its entry into the World Trade Organization. India's improved score reflects changes to its copyright laws to bring them in line with international standards, the report said.
Japan came in first again with 84.1 points. It was followed closely by other developed countries, including Australia, the United States, Germany, Singapore, France, Britain and South Korea, which all scored in the upper 70s.
Singapore jumped to fifth place, from tenth last year, after it passed a new data privacy law praised by BSA for its "light touch" and balanced approach.
"They are really taking on digital trade as another way of putting a stake in the ground and to say they are going to be global hub of business," Holleyman said.
The United States finished second in the survey, up from third in the inaugural report, while Germany, France and Britain each slipped a notch and Italy fell four spots.
Holleyman said the European Union was working on data protection regulations that could potentially make it harder to move data across its borders.
"If that happens I think you can continue to see further sliding by the major European countries," Holleyman said.
Talks on a US-EU free trade agreement are expected to start by June, he said.
Cross-border data flows are already a focus in talks on the Trans-Pacific Partnership (TPP), a proposed regional free trade agreement between the United States and ten other countries in the Asia-Pacific slated for conclusion this year.
One of the TPP countries, Vietnam, finished last in this year's cloud computing scorecard, with a tally of 40.1 points.
Vietnam, Indonesia, China and India have pursued policies that threaten to divide the cloud, either by trying "to wall themselves off or by imposing local requirements that are antithetical to the very underpinning of cloud computing," Holleyman said.
Sunday, 3 March 2013
The government has allowed IT and ITeS special economic zones to set up disaster recovery centres outside their limits in any part of the country, meeting a long-pending demand of the industry.
Issuing the guidelines for setting up of disaster recovery centres (DRC) and business continuity plan (BCP) for IT/ITeS special economic zones, the Commerce and Industry Ministry said the locations for such facilities will be approved by the respective development commissioner.
"The DRC/BCP location will be approved by the development commissioner (DC) on an application made by the SEZ unit. Such approval will allow the unit to relocate its operations, data and employees to the DRC/BCP location upon the occurrence of a disaster," it said.
However, it said that as this activity is envisaged as a purely internal exercise to be carried out across branches of the same SEZ entity to ensure business continuity, there will be no commercial activity involved and, accordingly, no commercial invoice will be raised for such movement of data, operations and employees.
"It was a long pending demand of the industry. IT/ITeS SEZs need such facilities at the time of any type of disasters. It will certainly help the sector," an official said.
Prevention and data backup are integral parts of the sector's DR/BCP strategy.
"The data are regularly backed up at locations which are isolated from the main business centres to prevent loss in the event of a disaster. This would entail movement of data from SEZ to a DR/BCP location outside the SEZ and movement of storage media back into the zone," it said.
It also said that movement of data outside the zone would not be treated as exports, and that a record of the movement of magnetic storage tapes and devices would be maintained at the tax-free enclave.
However, the unit would have to pay the necessary duty on the tapes and storage devices on which the data is moved.
It said that the back up location where the "devices are moved could be a location under another SEZ or export oriented units i.e. a bonded secured location".
Further, the guidelines comprehensively define the term 'disaster' and classify it into two categories: natural and man-made. Man-made disasters include hazardous material spills, infrastructure failure and bio-terrorism.
It has also provided norms for setting up of these centres by a third party client.
The move assumes significance because, of the more than 160 operational SEZs, about half are IT/ITeS SEZs.