Ethereum (ETH), the biggest altcoin, is leading today’s rally, outshining its biggest competitor Bitcoin (BTC), which has dominated the news since the start of the year.
Over the last 24 hours, the price of ETH has jumped by more than 10% as altcoins shine in the crypto market rally. Regarding gains, Collateral Network (COLT), a new decentralized lending platform, has also seen its token jump 40% while still in presale.
Since the start of the 2023 bull market, Ethereum (ETH) has played second fiddle to Bitcoin (BTC), which has been the biggest gainer. However, an altcoin season rally could start, with Ethereum (ETH) gaining over 10% in the last 24 hours.
The 24-hour price chart for Ethereum (ETH). Source: CoinMarketCap
While Ethereum (ETH) is up almost 10% at the time of writing, BTC has gained only 2%. As with previous altcoin seasons, major altcoins like Ethereum (ETH) have been posting better returns than Bitcoin this season.
In the last 24 hours, the top ten altcoins all have better returns than Bitcoin. As per data from CoinMarketCap, Binance Coin (BNB) is +3%, XRP is +4.8%, Cardano (ADA) is +7.2%, Dogecoin (DOGE) is +4.1%, Polygon (MATIC) is +4.5%, and Solana (SOL) is +3.1%.
With signs of the altcoin season already showing, Ethereum (ETH) looks poised to lead the rally as it just concluded its “Shapella” upgrade. Popular trader Credible Crypto confirms the altseason, saying that altcoins have “started the journey to new all-time highs.”
Collateral Network (COLT) Surges While in Presale
While the leading altcoins are just starting to rally, Collateral Network (COLT), a decentralized crowdlending platform still in its presale, has seen its price jump by 40% and is poised to grow even further.
At the start of the presale in February, $COLT was initially valued at $0.01 but currently trades at $0.014. According to data provided by the official Collateral Network (COLT) website, the price of COLT will hit $0.0168 when an additional 120,000,000 COLT are sold.
There is so much buzz around the launch of the Collateral Network (COLT) because it is set to be the first fractional NFT platform in the world that simplifies the process of taking out loans through the use of physical assets and smart contracts on a blockchain.
Collateral Network (COLT) will allow borrowers on the platform to mint fractionalized NFTs backed by their real-world assets, which are then funded by multiple lenders. These NFTs will be given the value of the assets that they are backed by. Cars, jewelry, real estate and fine art are all examples of viable tangible assets.
Lenders, in return, receive passive income from a fixed interest rate while a borrower pays off their loan.
To create transparency, Collateral Network (COLT) will store the information about the physical asset backing each NFT in the metadata, which will be publicly available on-chain and tamper-proof, meaning it cannot be changed.
How to Buy COLT
You can purchase COLT using BTC, ETH, BNB, USDT, SOL, SAND, MANA, DOGE, or SHIB. The platform will automatically convert them into the equivalent amount of $COLT.
Conclusion
With the Ethereum (ETH) upgrade and upcoming staked token unlock coinciding with the bottoming of altcoins, the world's biggest smart contracts platform looks poised to lead the incoming alt season.
Meanwhile, the Collateral Network (COLT) presale is a great opportunity for investors to get involved in a project which can revolutionize the crowdlending industry forever. Read more about the project on the official Collateral Network (COLT) website.
Find out more about the Collateral Network presale here:
Spark For Mac Gains Labels, Better Folder Management, Smarter Search & More
Prolific Ukrainian developer Readdle today pushed a major update to its award-winning macOS email client, Spark, bringing new features such as labels, improved folder management, smart filters, the ability to save emails in Drafts manually, and other improvements that will make you love email again.
As I wrote before, Spark is (in my personal opinion) hands down the best email client I've used on my Mac. Spark 1.2 for macOS is available at no charge from the Mac App Store.
The app is also available for iPhone, iPad and Apple Watch.
Here’s everything new in Spark 1.2 for Mac.
Introducing labels
"We know how powerful labels can be while managing a busy or shared inbox and believe we've finally found the best solution," the developers wrote. To use labels in your Spark workflow, you must first enable the new Show Labels in List option in the app's preferences.
Spark 1.2 lets you create new email labels with ease or use your existing Gmail ones. A new Color Tags feature, shown above, provides additional context to custom tags you apply to individual messages.
Improved folder management
Spark has always allowed you to file emails into folders. With reinvented folder management in Spark 1.2, you can put your inbox in order more easily than ever before by cherry-picking the folders to be displayed in the app's tweaked sidebar.
You can also now reorder folders with drag and drop.
As part of the update, you can use Color Tags, Recents, Favorites and Smart Folders with natural language filters (i.e. “Emails from Seb with PDF files”) to manage your folders and emails.
Tidbits: counter badge for folders, better smart search & more
If you're using mail rules to sort your emails, you can see how many unread items you have in each folder by enabling the message count badge for folders on the Spark app's icon in your Dock. To turn this feature on, open Spark's preferences and go to Message count for other folders → New Emails/All Emails.
Last but not least, you can now save draft emails manually in Spark. Additionally, contact suggestions in the To field work much better now, and the BCC options in Spark's settings have been redesigned.
Bug fixes
Aside from new features, the team has fixed a bunch of issues:
Fixed an issue with old emails not loading
Fixed an issue with Japanese characters
Fixed the crash while customizing a Touch Bar
Fixed the issue with Return button when moving an email
Fixed the issue with color coding when ‘None’ is selected
Fixed Reply All shortcut issue
Fixed UI issue with labels in mail list
Fixed rare crashes in Smart Inbox settings
Fixed signature appearance issue
Fixed email drag and drop behavior
Fixed search suggestions issue
Fixed crash on message deletion
Fixed issues with some HTML emails
Fixed Drafts saving
The launch speed is now faster than ever
Fixed memory consumption and CPU usage issues
“We’ve also got big news to be announced soon,” teased Readdle. Don’t worry, we’ll keep you in the loop so stay tuned to iDownloadBlog for more Spark coverage.
Watch our Spark hands-on video
Wondering why all the fuss?
Be sure to watch my colleague Andrew O’Hara’s hands-on video below.
Subscribe to iDownloadBlog on YouTube
Have you used Spark before and if so, how do you like the app? Speaking of which, what’s your favorite email client for macOS? How about your iPhone and iPad? I use Spark across my iPhone, iPad, Mac and Apple Watch and couldn’t be happier with it—Spark is all I’ve ever wanted from a cross-platform email app without being a resource hog or inundated with lesser-used features.
Spark for Mac and Spark for iPhone and iPad are available for free on the App Store.
Ethereum Tutorial For Beginners: What Is Ethereum Blockchain?
What is Ethereum?
Ethereum is an open-source, blockchain-based software platform that offers smart contract functionality. It is a distributed computing platform that supports developing decentralized applications (DApps) using blockchain technology. Ethereum provides a decentralized virtual machine, the Ethereum Virtual Machine (EVM), which can run scripts using an international network of public nodes.
Ethereum is the largest decentralized software platform. It helps you build smart contracts and decentralized applications without any downtime or third-party interference. Ethereum allows developers to create and publish next-generation distributed applications.
In this Ethereum tutorial for beginners, you will learn Ethereum basics such as why Ethereum is needed, its history, smart contracts, Ether and gas, how Ethereum compares with Bitcoin, and Ethereum's applications, advantages, and disadvantages.
Why do you need Ethereum?
Centralized systems are one of the most widespread models for software applications. Such a system directly controls the operation of the individual units and the flow of information from a single center. In this kind of system, individuals depend on the central power to send and receive information.
However, centralized systems have several issues:
Single point of control & failure
It can be corrupted easily
Performance bottleneck
Silo effect
The Solution is Decentralized Applications
Decentralized applications never rely on a centralized backend; they interact directly with a blockchain. Refer to this tutorial to learn more about blockchain.
The term DApp is a combination of two words: decentralized application. In simple words, it is an application, tool, or program that works on the decentralized Ethereum blockchain.
History of Ethereum
2013: Vitalik Buterin, a developer who had been involved with Bitcoin, first described Ethereum in a white paper.
2014: A Swiss firm, Ethereum Switzerland GmbH, developed the first Ethereum software project.
2015: Frontier, the first version of Ethereum, was launched.
March 14, 2016: Homestead, a planned protocol upgrade, became the second major version of the Ethereum network.
May 2016: Ethereum got its most extensive media coverage yet when the DAO raised a record $150 million in a crowd sale.
June 2016: The DAO was hacked by an anonymous group, which claimed roughly $50 million worth of ETH.
July 2016: The network branched into two: Ethereum (ETH) and Ethereum Classic (ETC).
May 2017: Predictions circulated that Ethereum would eventually overtake Bitcoin's success.
June 2017: Ethereum rallied above $400, recording a 5,001% rise since January 1, 2017.
What is Smart Contract?
A Smart Contract is a computer program that executes automatically. It is a transaction protocol that allows blockchain users to exchange money and property, and to perform actions such as voting, without any central authority. It is a virtual third-party software agent that can automatically execute and enforce the terms and actions of a legal agreement.
How Do Smart Contracts Work?
When a transaction triggers a smart contract, every node in the network executes the contract's code in the EVM and agrees on the result. Once the conditions written into the contract are met, the outcome it encodes (releasing a payment, transferring ownership, recording a vote) is enforced automatically, with no intermediary.
Traditional Contracts vs. Smart Contracts
Below is the difference between traditional contracts and smart contracts:

Parameter | Traditional contracts | Smart contracts
Duration | 1–3 days | Minutes
Remittance | Manual | Automatic
Escrow | Necessary | Not necessary
Cost | Expensive | Fraction of the cost
Presence | Physical presence | Virtual presence
Lawyers | Lawyers are important | Lawyers may not be necessary
Key Terms in Ethereum
Currency Issuance: It is mostly managed and monitored by a country’s central bank. It is also referred to as a monetary authority.
Decentralized Autonomous Organizations (DAO): A digital organization that aims to operate without the need for hierarchical management. A DAO is a combination of computer code, a blockchain, smart contracts, and people.
Smart Contracts: A digitally signed agreement between two or more parties which relies on a consensus system.
Ethereum Wallet: The Ethereum Wallet is a gateway to decentralized applications on the Ethereum blockchain. It helps you hold and secure Ether and other crypto-assets built on Ethereum.
Solidity: Solidity is the smart contract language used in Ethereum. It is a general-purpose programming language developed to run in the EVM environment. Solidity lets you perform arbitrary computations, but its main aim is to send and receive digital tokens and store state.
Transactions: A transaction is a message sent from one account to another (the receiving account might be the same as the sender, or empty). It can include binary data (its payload) and Ether.
Ethereum Virtual Machine: The Ethereum Virtual Machine, also known as the EVM, is the runtime environment for smart contracts. The EVM is a computation layer directly above the underlying hardware. It is not just sandboxed but completely isolated: code running inside the EVM has no access to the network, the filesystem, or any other processes.
What is Ether?
Ether is the value token of the Ethereum blockchain, listed as "ETH" on cryptocurrency exchanges. It is the currency in which transaction fees on the network are paid.
Gas
To perform a transaction on the Ethereum network, a user has to make a payment (to the miner) in Ether via an intermediary unit called 'gas.' Gas is the unit that measures the computational work required to run a smart contract or another transaction.
In Ethereum, the transaction fee is calculated in Ether as:

Tx Fee (in Ether) = Gas Limit * Gas Price

where:

Gas Limit = the amount of gas allotted for the computation
Gas Price = the amount of Ether the user pays per unit of gas
Typical Ethereum Network Transaction
Ethereum vs. Bitcoin
Here is the main difference between Ethereum and Bitcoin:
Parameter | Bitcoin | Ethereum
Definition | Bitcoin is digital money. | Ethereum is a world computer.
Founder | Satoshi Nakamoto | Vitalik Buterin
Hashing algorithm | SHA-256 | Ethash
Average block time | 10 minutes | 10–15 seconds
Release date | January 9, 2009 | July 30, 2015
Release method | Genesis block (mined) | Presale
Blockchain | Proof of Work | Proof of Work (planning a move to Proof of Stake)
Usage | Digital currency | Digital currency
Cryptocurrency used | Bitcoin (satoshi) | Ether
Block time | 10 minutes | 12–14 seconds
Mining | ASIC miners | GPUs
Scalable | Not now | Yes
Concept | Digital money | World computer
Cryptocurrency token | BTC | Ether
Turing completeness | Turing-incomplete | Turing-complete
Coin release method | Early mining | Through ICO
Protocol | Still employs the pool-mining concept | Uses the GHOST protocol
Next in this Ethereum tutorial, we will learn about applications of Ethereum.
Applications of Ethereum
Below are the applications of Ethereum:
Banking: With Ethereum's decentralized system, it is almost impossible for a hacker to gain unauthorized access to an individual's personal information.
Agreements: By using a smart contract, agreements can be maintained and executed without any alteration.
Prediction market: The prediction market is another wonderful use case of Ethereum Smart Contract. The platforms like Gnosis and Augur use Ethereum for this purpose.
Digital Identity Management: Digital identities can be managed by using smart contracts which solve the major issues of identity theft and data monopoly.
Advantages of Ethereum
Allows you to upload and request programs to be executed.
100% uptime and DDoS resistance.
Ethereum helps you to create a tradable token that you can use as a new currency or virtual share.
Persistent and permanent data storage.
Build virtual organizations.
Helps you to develop decentralized applications.
Ethereum helps you to build fault-tolerant and highly secure decentralized apps.
Disadvantages of Ethereum
The Ethereum Virtual Machine is slow, so you can't use it for large computations.
Storage on the blockchain is expensive.
Scalability is an issue, so there is a trade-off with decentralization; private blockchains are likely to proliferate.
Fixing bugs or updating apps is a tough task because every peer in the network needs to update its node software.
Some applications require verification of user identity, yet there is no central authority to verify it.
If you want to learn about creating your own cryptocurrency, here’s a free tutorial you’ll want to check out: How to Create Your Own Cryptocurrency?
Summary
Ethereum is an open-source software platform based on blockchain technology.
Ethereum helps you to build smart contracts and decentralized applications without any downtime or any third-party interference.
Ethereum was first described in 2013 by developer Vitalik Buterin and launched in 2015.
Smart contracts allow blockchain users to exchange money and property. Ethereum can be used for smart contracts as well as digital currency.
Ether is a value token of the Ethereum blockchain. It is listed as “ETH” on cryptocurrency exchanges.
To perform a transaction on the Ethereum network, a user has to make a payment (to the miner) in Ether via an intermediary unit called 'gas.'
Ethereum offers 100% uptime and is DDoS-resistant.
Fixing bugs or updating apps on the Ethereum network is a tough task because every peer in the network needs to update its node software.
Introduction To Aggregation Functions In Apache Spark
This article was published as a part of the Data Science Blogathon.
Introduction
Aggregating is the process of bringing some data together, and it is an important concept in big data analytics. In an aggregation, you need to define a key or grouping, along with an aggregation function that specifies how the transformations will be performed on the columns. Given multiple input values, the aggregation function generates one result for each group.

Spark's aggregation capabilities are sophisticated and mature, with a variety of different use cases and possibilities. Aggregations are generally used to get a summary of the data: you can count, add, and find the product of values. Using Spark, you can also aggregate any kind of value into a set, list, etc. We will see this in "Aggregating to Complex Types".
We have some categories in aggregations.
Simple Aggregations
The simplest grouping is to get a summary of a given data frame by using an aggregation function in a select statement.
Grouping Aggregations
A "group by" allows you to specify one or more keys, plus one or more aggregation functions, to transform the value columns.
Window functions
A "window" lets you specify one or more keys and one or more aggregation functions to transform the value columns, but the input rows to the aggregation function are related to the current row through a window (a tiny preview sketch follows this list).
All these aggregations in Spark are implemented via built-in functions.
In this article, I am going to discuss simple aggregations.
Prerequisites
Here, I am using Apache Spark 3.0.3 version and Hadoop 2.7 version. It can be downloaded here.
I am also using Eclipse Scala IDE. You can download it here.
I am using a CSV data file. You can find it on the GitHub page.
The data set contains the following columns.
station_id, name, lat, long, dockcount, landmark, and installation.
This is bike station data.
Importing Functions
I am importing all functions here because aggregation is all about using aggregate functions and window functions.
This can be done by using
import org.apache.spark.sql.functions._

Now I am reading the data file into a data frame.
Simple Aggregations
Now, we are ready to do some aggregations. Let's start with the simplest one.
The simplest form of aggregation is to summarize the complete data frame and it is going to give you a single row in the result. For example, you can count the number of records in this data frame and it will return you a single row with the count of records.
Now, we start with the data frame, use the select() method, and apply the count function. You can give an alias to the summary column, add another summary column for the sum of the dockcount column, and also compute the average.

We also have the countDistinct() function. Here, I am counting the unique values of the landmark column: countDistinct() will give the number of unique landmarks in this data frame. There is also approx_count_distinct(). countDistinct() groups the distinct values and counts them, which takes time on a huge dataset with millions of rows; in that case, we can use approx_count_distinct(), which returns an approximate count. It is not 100% accurate, but we can use it when speed is more important than accuracy. Finally, when you want the sum of a distinct set of values, you can use the sumDistinct() function.

All of these can be implemented like this.
df.select(
  count("*").as("Count *"),
  sum("dockcount").alias("Total Dock"),
  avg("dockcount").alias("avg dock"),
  countDistinct("landmark").alias("landmark count"),
  approx_count_distinct("station_id").alias("app station"),
  sumDistinct("station_id").alias("station_id")
).show()

The select method will return a new data frame, and you can show it.
Let me run this.
The output will be as follows.
So, as expected, we summarized the whole data frame and got one single row in the result.
Great!
We have many other aggregation functions like first() and last() where you can get the first and last values in a data frame. We can get the minimum and maximum values using min() and max() functions respectively.
This can be done in Scala like this.
df.select(
  first("station_id").alias("first"),
  last("station_id").alias("last"),
  min("dockcount").alias("min"),
  max("dockcount").alias("max")
).show()

When we execute this, we will get the following output.
Now, I am going to use selectExpr(), where we can pass SQL-like expressions.
df.selectExpr(
  "mean(dockcount) as mean_count"
).show()

Here, I am calculating the mean of the dockcount column.
The mean value is displayed.
Variance and Standard Deviation
Let's look into other aggregate functions: variance and standard deviation. As we all know, variance is the average of the squared differences from the mean, and standard deviation is the square root of the variance.
They can be calculated by
df.select(
  var_pop("dockcount"),
  var_samp("dockcount"),
  stddev_pop("dockcount"),
  stddev_samp("dockcount")
).show()

And the output is:
Skewness and Kurtosis
Skewness is the degree of distortion from the normal distribution; it may be positive or negative. Kurtosis is all about the tails of the distribution and is used to find outliers in the data.
It can be identified by
df.select(
  skewness("dockcount"),
  kurtosis("dockcount")
).show()

The output is:
Covariance and Correlation
Next, we will look at covariance and correlation. Covariance is a measure of how much two columns, features, or variables vary together. Correlation is a measure of how much they are related to each other.
It can be calculated by
df.select(
  corr("station_id", "dockcount"),
  covar_samp("station_id", "dockcount"),
  covar_pop("station_id", "dockcount")
).show()

The output is:
Aggregating to complex types
Next, we will see aggregating to complex types. Suppose you want to store a particular column in a list, or you need the unique values of a column in a list: you can use collect_list() or collect_set(). collect_set() will store the unique values, and collect_list() will contain all the elements.
Here is the implementation.
df.agg(collect_set("landmark"), collect_list("landmark")).show(false)

The output is:
Complete Code
Here is the entire implementation.
import org.apache.spark.sql.functions._
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

object demo extends App {
  val conf = new SparkConf().setAppName("Demo").setMaster("local[1]")
  val sc = new SparkContext(conf)
  val spark = org.apache.spark.sql.SparkSession.builder.master("local[1]").appName("Demo").getOrCreate()

  // The original listing never defines df; reading the bike station CSV
  // (file name assumed) is needed to make the code runnable.
  val df = spark.read.option("header", "true").option("inferSchema", "true").csv("station.csv")

  df.select(
    count("*").as("Count *"),
    sum("dockcount").alias("Total Dock"),
    avg("dockcount").alias("avg dock"),
    countDistinct("landmark").alias("landmark count"),
    approx_count_distinct("station_id").alias("app station"),
    sumDistinct("station_id").alias("station_id")
  ).show()

  df.select(
    first("station_id").alias("first"),
    last("station_id").alias("last"),
    min("dockcount").alias("min"),
    max("dockcount").alias("max")
  ).show()

  df.selectExpr("mean(dockcount) as mean_count").show()

  df.select(
    var_pop("dockcount"),
    var_samp("dockcount"),
    stddev_pop("dockcount"),
    stddev_samp("dockcount")
  ).show()

  df.select(skewness("dockcount"), kurtosis("dockcount")).show()

  df.select(
    corr("station_id", "dockcount"),
    covar_samp("station_id", "dockcount"),
    covar_pop("station_id", "dockcount")
  ).show()

  df.agg(collect_set("landmark"), collect_list("landmark")).show(false)
}

End notes
So, these are all simple aggregations. A simple aggregation always gives you a one-line summary. Sometimes you may want a more detailed summary, for example, combining two or more columns and applying aggregations per combination. That can be done simply with Spark SQL, but you can do the same using data frame expressions via the concept of grouping aggregations, which I will discuss in another article. You can find it here.
Top 10 Impacts Of Ethereum And Nfts On The Art World
The adoption of Ethereum and NFTs by artists, collectors, and art institutions to produce, market, and acquire digital art has a profound impact on the art industry. The ownership of a distinctive piece of content, such as digital artwork, music, film, or even tweets, is represented by NFTs, which are digital tokens built on the blockchain.
Top 10 Impacts of Ethereum and NFTs on the Art World
NFTs have created new opportunities for artists and collectors alike, challenging traditional notions of art ownership and value. In this article, we will discuss the impacts of Ethereum and NFTs on the art world.
The Rise of Digital Art
The development of digital art has also been aided by Ethereum and NFTs. Digital art was frequently thought of as inferior to more traditional kinds of art, such as painting or sculpture. NFTs, however, have given digital art a fresh sense of legitimacy and worth.
The Democratization of Art Ownership
The democratization of art ownership is one of Ethereum and NFTs’ most important effects on the art world. In the past, only the affluent elite could afford to purchase and display works of art in their homes or private collections. Yet, NFTs have made it possible for anyone, regardless of their financial situation, to own a work of digital art.
Increased Transparency in Art Transactions
The transparency of art transactions has also grown because of Ethereum and NFTs. Buyers and sellers can quickly confirm the validity and ownership of artwork because NFTs are recorded on the blockchain, lowering the possibility of fraud or counterfeiting.
Challenges for Traditional Art Institutions
While Ethereum and NFTs have given artists and collectors new opportunities, they have also raised difficulties for established art organizations like museums and galleries. These organizations could find it challenging to keep up with emerging platforms and technologies, as well as the shifting nature of the art world.
New Revenue Streams for Artists
The development of new sources of income for artists is another big effect of Ethereum and NFTs on the art world. Artists used to frequently receive a one-time payment for their work in the past, and they had little choice over how it was utilized or disseminated. NFTs, on the other hand, give artists the chance to keep control of their creations and profit from each sale by taking a cut of the proceeds.
The Impact on The Environment
Environmental concerns have also been raised about Ethereum and NFTs. Several environmental activists have criticized NFTs because minting and trading them consumes a large amount of energy.
The Potential for New Collaborations and Partnerships
New prospects for collaborations and partnerships between artists and other industries, like gaming and virtual reality, have been made possible by Ethereum and NFTs. This could lead to the development of fresh, cutting-edge artistic mediums that straddle traditional and cutting-edge media.
The Need for Education and Awareness
The Blurring of Boundaries Between Art and Technology
Ethereum and NFTs are also blurring the lines between art and technology. Digital art is frequently created with software, and the creation and exchange of NFTs on blockchain technology only emphasizes this relationship.
The Potential for Further Innovation
Spark: Lighting A Fire Under Hadoop
Hadoop has come a long way since its introduction as an open source project from Yahoo. It is moving into production from pilot/test stages at many firms. And the ecosystem of companies supporting it in one way or another is growing daily.
It has some flaws, however, that are hampering the kinds of Big Data projects people can do with it. The Hadoop ecosystem uses a specialized distributed storage file system, called HDFS, to store large files across multiple servers and keep track of everything.
While this helps manage the terabytes of data, processing data at the speed of hard drives makes it prohibitively slow for handling anything exceptionally large or anything in real time. Unless you were prepared to go to an all-SSD array – and who has that kind of money? – you were at the mercy of your 7,200 RPM hard drives.
The power of Hadoop is all centered around distributed computing, but Hadoop has primarily been used for batch processing. It uses the framework MapReduce to execute a batch process, oftentimes overnight, to get your answer. Because of this slow process, Big Data might have promised real-time analytics but it often couldn’t deliver.
Enter Spark. It moved the processing part of MapReduce to memory, giving Hadoop a massive speed boost. Developers claim it runs Hadoop up to 100 times faster in certain applications, and in the process opens up Hadoop to many more Big Data types of projects, due to the speed and potential for real-time processing.
Spark started as a project in the University of California, Berkeley AMPLab in 2009 and was donated as an open source project to the Apache Foundation in 2012. A company was spun out of AMPLab, called Databricks, to lead development of Spark.
Patrick Wendell, co-founder and engineering manager at Databricks, was a part of the team that made Spark at Berkeley. He says that Spark was focused on three things:
1) Speed: MapReduce was based on an old Google technology and is disk-based, while Spark runs in memory.
2) Ease of use: "MapReduce was really hard to program. Very few people wrote programs against it. Developers spent so much time trying to write their program in MapReduce and it was a huge waste of time. Spark has a developer-friendly API," he said. It supports eight different languages, including Python, Java, and R.
3) Make something broadly compatible: Spark can run on Amazon EC2, Apache’s Mesos, and various cloud environments. It can read and write data to a variety of databases, like PostgreSQL, Oracle, MySQL and all Hadoop file formats.
"Many people have moved to Spark because they are performance-sensitive and time is money for them," said Wendell. "So this is a key selling point. A lot of the original Hadoop code was focused on offline batch processing, often run overnight. There, latency and performance don't matter much."
Because Spark is not a storage system, you can use your existing storage network, and Spark will plug right into Hadoop and get going. Governance and security are taken care of. "We just speed up the actual crunching of what you are trying to do," said Wendell. Of course, that's also predicated on giving your distributed servers all the memory they will need to run everything in memory.
Prakash Nanduri, CEO of the analytics firm Paxata, said that Spark made Hadoop feasible for working in real time. "Now you have the ability to focus on real-time analytics at scale. The huge implication is suddenly you go from 10 use cases to 100 use cases, and do it at a cost that is significantly lower than for traditional interactive analytic use cases," he said.
Many of the cloud vendors that offer some kind of Hadoop solution, like Cloudera, Hortonworks, and MapR, are bundling Spark with Hadoop as a standard offering now, said Wendell.
At a recent Spark Summit, Toyota Motor offered an example of the speed Spark offers. It uses social media to watch for repair issues in addition to customer inquiries. The problem with the latter is that people don't care about surveys, so it shifted its emphasis to Twitter and Facebook. The company built an entire system on Spark to monitor social media for keywords.
Its original customer experience app, done as a regular Hadoop batch job, would take 160 hours, or over six days. The same job rewritten for Spark completes in just four hours. The company also parsed the flood of input from social media, filtering out things like dealer promos and other material irrelevant to incident reports involving Toyota products, which reduced the amount of data to process by 50%.
Another use case is log processing and fraud detection, where speed is of the utmost importance, as banks, businesses, and other financial and sales institutions need to move fast to catch fraudulent activity and act on the warnings.
“The business value you achieve is fundamentally derived through the apps. In the case of financial services, you need to be able to detect money laundering cases. You cannot find money laundering signals by running a batch process at night, it has to be in real time,” said Nanduri. “An app built on Spark can do the entire data set in real time and interactive speeds and get to the answer much faster.”
But Spark isn't just about in-memory processing. Wendell said half of the performance gains come from running in memory and the other half from optimizations. "The other systems weren't designed for latency, so we improved on that a lot," he said.
There is still more work to be done. Wendell said there is a big initiative underway with Databricks and Apache to further improve Spark performance, but he would not elaborate.
"While it offers a standardized way to build highly distributed and interactive analytical apps, it still has a long way to go," said Nanduri. "Spark lacks security and needs enhanced support for multiple concurrent users, so there is still some work to do."