United Business Media EE Times



Yahoo releases parallel software
IBM: Hadoop may fuel next wave of Web apps

EE Times


SANTA CLARA, Calif. — Web search pioneer Yahoo officially released its source code for Hadoop, a parallel programming framework many see as the key ingredient for cloud computing services. The software could fuel a broad class of future Web-based applications, said an IBM executive at the second-annual Hadoop Summit.

Hadoop is an open source version of MapReduce, the set of proprietary algorithms Google uses to run data-intensive applications such as Web search across large clusters of PC-based servers. Yahoo is a key contributor to the open source project, which drew a packed house of more than 700 developers to the event here.
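
For readers unfamiliar with the MapReduce model, the idea can be sketched in a few lines of ordinary Python. This is only an illustration of the map, shuffle and reduce steps on a single machine, not Hadoop's or Google's actual code; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this example, and a real cluster would spread the map and reduce calls across many server nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group intermediate values by key, the step a framework performs between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Hypothetical single-process driver: on a cluster, each document (or block)
    # would be mapped on a different node and the reducers would run in parallel.
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["web search", "web ads"]))
```

Because each map call looks at only one document and each reduce call at only one key, the work can in principle be divided among as many machines as are available, which is what makes the model attractive for large clusters.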

Yahoo is standardizing on Hadoop for a broad range of internal uses, such as optimizing search results, placing Web content and ads, and running machine learning algorithms that filter spam in its email service. The company claims to be the world's largest production user of Hadoop, with the code running on more than 25,000 server nodes.

"We've had a lot of requests for exactly what we are running with what patches, so here it is," said Eric Baldeschwieler, vice president of Hadoop development at Yahoo. "We think this will be good for the ecosystem," he said.

Yahooligans Eric Baldeschwieler, vice president of Hadoop development (left), and Owen O'Malley, software architect for grid computing, answer questions about the company's release of its Hadoop version 0.20.00 source code.

The initial code available on Yahoo's Web site is an alpha version of release 0.20. It includes a scheduler letting multiple users share a cluster via separate queues. In its production systems Yahoo currently uses Hadoop 0.18.3.

"The .18 branch has a lot of components with funny licenses that have been pushed out of .20, so this is obviously a more appealing offering," said Baldeschwieler.

Work is already beginning on a .21 release, which aims to be the last to make any major changes to application programming interfaces. "We hear about backwards compatibility all the time, so we need to make sure we don't break running apps on a cluster," said Baldeschwieler.

"Hadoop is setting the context for a new set of apps [the average business] couldn't get access to or were too expensive to write but now have a bright future," said Rod Smith, a vice president for emerging Internet technologies at IBM.

End users want to conduct custom Web searches, apply analysis to the results and conduct follow-on searches in repeated loops in areas ranging from financial analysis and medical research to retailing and fraud detection, Smith said. "Google has whetted people's appetite for customizable search," he added.

IBM is developing proof-of-concept tools for such apps that could run on more typical business computers with a few dozen nodes. The IBM effort, called M2, employs Hadoop as a back-end engine for a suite of browser-based analytical and visualization tools.

"We think tools like this will be helpful in collecting and extracting content and letting users run operations on it over and over again," Smith said. "But so far this is still cookie dough—it's not baked yet," he said.

Hadoop already has spun out seven new open source projects, only two of which are more than a year old, said Owen O'Malley, a software architect for grid computing at Yahoo. "When we started [in early 2006] this was a prototype that ran on 20 nodes as long as you said nice things to it every day," O'Malley said.

Executives from Hewlett-Packard, Intel and Yahoo said yesterday Hadoop is a key piece of a future open source software stack they hope to build to enable cloud computing. The software is also in use by companies including Facebook, Hulu, News Corp., Ning and Veoh.







    All materials on this site Copyright © 2010 TechInsights, a Division of United Business Media LLC All rights reserved.