Big Data Made Simple - One source. Many perspectives.


Docker Use Cases – How to handle big data with Docker

Docker containers provide a way to package applications with everything needed to run them, including base operating system images, databases, libraries, and binaries. By running a Docker engine on a host machine, Docker containers interact solely with the kernel of the host OS, meaning all containerized apps function the same regardless of the underlying infrastructure. Furthermore, you can run multiple apps on a single host machine, which leads to impressive cost savings by letting enterprises run more apps on existing hardware.

The statistics for Docker are telling with regard to its popularity and potential:

  • Docker adoption increased by 40 percent between 2016 and 2017, and the latest numbers show that 3.5 million applications have been placed in Docker containers.
  • Research firm 451 Research predicts a compound annual growth in container market revenue of 35 percent until 2021.

The rest of this article gives an overview of some use cases where Docker helps to handle Big Data sets, which are fast-moving, voluminous, and contain a huge variety of information from disparate sources and in different formats. For more info on containers, check out this Docker wiki page.

Docker & Big Data Use Cases

Isolate Big Data Tools

Alongside the hardware used to set up and manage Big Data clusters is a set of tools that developers and data scientists use to complete processing jobs or other tasks on Big Data. The problem that often arises is that each developer wants to use their own specific tools, which means distributing a whole gamut of tools and their dependencies to each machine within a Big Data cluster.

With a large number of developers, dependency issues will quickly arise, and one tool’s specific requirements can cause another tool to malfunction.

Docker offers a way to overcome these dependency issues by allowing you to build a Big Data ecosystem in which each tool is self-contained, along with all of its dependencies. Developers can use their own tools for different jobs without worrying about conflict with other tools because each tool is isolated within a container.
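
As a rough illustration of the isolation described above, the following Python sketch builds one `docker run` command per tool, each pinned to its own image. The tool names, image names, and paths are hypothetical placeholders; only the `docker run`, `--rm`, and `-v` options are standard Docker CLI usage. The sketch prints the command instead of executing it.

```python
import shlex

# Hypothetical tool-to-image mapping; the image names and tags are
# placeholders for illustration, not recommendations.
TOOL_IMAGES = {
    "spark-job": "example/spark:2.2",
    "pandas-report": "example/python-pandas:0.20",
}


def docker_run_command(tool, data_dir, args):
    """Build a `docker run` argv that runs one tool in its own container."""
    image = TOOL_IMAGES[tool]
    return [
        "docker", "run",
        "--rm",                     # remove the container when the job exits
        "-v", f"{data_dir}:/data",  # bind-mount the shared data directory
        image,
        *args,
    ]


cmd = docker_run_command(
    "spark-job", "/mnt/cluster/data", ["spark-submit", "/data/job.py"]
)
print(shlex.join(cmd))
```

Because each tool resolves to its own image, two tools can depend on conflicting library versions without interfering with each other on the same host.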

Run Scheduled Analytics Jobs

A scheduled analytics job is a type of automated data manipulation task that you can run either on a recurring schedule or at a particular time. These jobs are very useful for Big Data, which inundates organizations at high velocity and demands some form of automation to keep up. Docker containers add to the convenience by letting you run scheduled jobs without manually setting them up on each node in a Big Data cluster.

For example, Chronos is a fault-tolerant job scheduler running on top of Apache Mesos that enables the launching of Docker instances into a Mesos cluster, creating scheduled analytics jobs for those instances. Within Mesos, you can run distributed Big Data applications, such as Hadoop or Spark.

With Chronos and Mesos, your developers or sysadmins can schedule Docker containers to run ETL, batch, and analytics applications on a recurring or time-specific basis, all without the need for any manual setup on cluster nodes.

Aside from the convenience of using Docker for scheduled analytics, the Chronos job scheduler also shows you a job dependency graph to help track dependencies for different jobs.
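
To make the scheduling idea concrete, here is a minimal Python sketch that builds a job definition shaped like the Docker job example in the Chronos documentation, with an ISO 8601 repeating schedule. The job name, image, and command are hypothetical, and the resource values are arbitrary. Chronos accepts such definitions over HTTP, but the endpoint path varies by version, so this sketch only builds and prints the payload.

```python
import json
from datetime import datetime, timezone


def chronos_docker_job(name, image, command, start, interval="PT24H"):
    """Return a job definition for a container that runs on a repeating schedule."""
    # Normalise to UTC so the trailing "Z" in the schedule is accurate.
    start_utc = start.astimezone(timezone.utc)
    # ISO 8601 repeating interval: R (repeat forever) / start time / period.
    schedule = f"R/{start_utc.strftime('%Y-%m-%dT%H:%M:%SZ')}/{interval}"
    return {
        "name": name,
        "schedule": schedule,
        "command": command,
        "container": {"type": "DOCKER", "image": image},
        "cpus": "0.5",  # arbitrary resource values for illustration
        "mem": "512",
    }


job = chronos_docker_job(
    name="nightly-etl",                # hypothetical job name
    image="example/etl:latest",        # hypothetical image
    command="python /app/run_etl.py",  # hypothetical entry point
    start=datetime(2018, 1, 1, 2, 0, tzinfo=timezone.utc),
)
print(json.dumps(job, indent=2))
```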

Provision Big Data Development Environments

The ability to provision a Big Data environment on a local computer is useful for developers who want to learn more about the various technologies and tools needed to become proficient with Big Data ecosystems. After all, within a development context, learning by doing is the best way to gain knowledge. Docker can assist with this by enabling the creation of a multiple-node cluster on a single host machine, replicating the typical Big Data setup.

For example, Ferry is a tool that lets you run multiple container nodes on a single host machine using Docker. Developers can define, run, and deploy big data stacks using either the human-friendly YAML data serialization standard or JSON. The following definition creates a Big Data stack containing a 5-node cluster and a single Linux client to interact with Hadoop:

backend:
   - storage:
        personality: "hadoop"
        instances: 5
        layers:
           - "hive"
connectors:
   - personality: "hadoop-client"

After defining this Big Data stack, you can easily run it in Docker. Start up the Ferry server by running the sudo ferry server command in your Docker terminal, followed by ferry start hadoop.
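
Since the article mentions that stacks can also be written in JSON, the sketch below expresses the same stack as a Python dictionary and serialises it. It assumes that the JSON form mirrors the YAML structure key for key, which the article implies but does not show; the function name is hypothetical.

```python
import json


def hadoop_stack(instances=5):
    """Mirror the YAML stack definition as a plain dictionary."""
    return {
        "backend": [
            {
                "storage": {
                    "personality": "hadoop",
                    "instances": instances,
                    "layers": ["hive"],
                }
            }
        ],
        "connectors": [{"personality": "hadoop-client"}],
    }


stack = hadoop_stack()
print(json.dumps(stack, indent=2))
```

Generating the definition from code makes it easy to vary the node count for experiments, for example `hadoop_stack(instances=2)` on a small laptop.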

The ability to provision a Big Data stack locally like this is useful for developers who need a local environment for development purposes, but it’s also good for data scientists who want to experiment with Big Data technologies and further their knowledge.

Build A Big Data Microservices Architecture

Docker facilitates the transition to a microservices architecture for Big Data applications. Microservices are independent, modular services, and Docker containers provide a natural platform on which to implement such a setup for Big Data apps.

The main benefits of microservices for Big Data include easier application scalability and better quality data. Ingesting Big Data results in many possible points of failure that can lead to lower data quality. With microservices, development teams have an easier job in testing and maintaining services, reducing the chances of poor data quality.

Build A Multi-Cloud Distributed Big Data Processing System

The typical drawbacks for companies looking to extract meaningful information from their large data volumes are the need to provision a powerful data processing system and the requirement to install and use complex big data analytics tools.

As described in this paper, a possible use case for Docker is building a container-based big data processing system that spans multiple clouds and is accessible to everyone, with Docker Swarm orchestrating the containers.

Wrap Up

Docker’s impressive security, performance, and the speed at which you can create multi-node Hadoop clusters make it an ideal fit for use with Big Data workflows. Docker has particular advantages over Big Data ecosystems that use virtual machines because Docker containers are much more lightweight, and they require much less time and effort to set up Hadoop clusters or other Big Data environments.

The post Docker Use Cases – How to handle big data with Docker appeared first on Big Data Made Simple - One source. Many perspectives..




13 simple tips to prevent your online data being throttled

People often reuse the same ID and password on every website because they are afraid of forgetting them. But if you use the same ID and password on multiple websites and share a lot on social media, you could be the next target of cybercriminals.

How can you keep your online data safe?

Worried? Just take a deep breath and relax! We are going to share a few tips that can help you protect yourself from fraudsters and hackers. All you have to do is go through these dos and don’ts and follow them carefully to stay safe online.

1. Never click on an unexpected link

While browsing social media you may come across something unexpected: an offer, an unwanted advertisement, an email, or a request for permission to access your profile. Remember, this kind of unsolicited content is often a trap! By clicking on the link, you may give cybercriminals access to your online data, which they can then use to hack and misuse your personal profile.

2. Use multiple passwords

It’s better to note down your passwords in a diary than to use the same password on different websites. Keep your passwords smart; don’t use the name of something or someone dear to you. If one of your accounts gets hacked, the others stay out of the attacker’s hands, and recovery is easier when every password is different and strong.
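
One way to get a different, strong password for every site is to generate them randomly. The Python sketch below uses the standard library's `secrets` module; the site names and the length of 16 characters are assumptions for illustration, and some sites restrict which punctuation characters they accept.

```python
import secrets
import string

# Letters, digits, and punctuation; adjust if a site rejects some symbols.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=16):
    """Return a random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Hypothetical site names, for illustration only.
sites = ["mail.example", "shop.example", "social.example"]
passwords = {site: generate_password() for site in sites}

for site, password in passwords.items():
    print(site, password)
```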

3. Use antivirus software

Those who use reliable antivirus software reduce the chance of having their data stolen. Antivirus software puts a barrier between hackers and your accounts and personal data.

4. Never hesitate to block

If someone sends you a personal message or request on social media and seems to be prying too far into your profile, don’t wait for something to go wrong: block that person on suspicion alone. Remember, he is not your boss, and you owe him no answers.

5. Think before sharing any information

Make sure that whatever you share on social media, whether information, a picture, or a comment, cannot be misused against you in the future. Fraudsters keep a close eye on your activities and can later use what they learn to enter your social circle with the intention of defrauding you or misusing your content.

6. Act wisely while online shopping

When you shop online, remember that you are sharing your card details. The most important thing when making an online transaction is to check the credibility of the site you are buying from. A fake shop may be a trap set by hackers, who could capture your card details and drain your money without your permission.

7. Don’t click on “allow pop-up”

Whenever we open a new site or download something, we often face a dialog box asking us to allow pop-ups. Not always, but often, these pop-ups carry malicious software that can be used to steal your identity or personal data, including private information. So it is better to ignore such pop-ups, so that you won’t regret it later.

8. Say no to public Wi-Fi

Who doesn’t like free Wi-Fi, especially when bored in a public place? Well, things are often not what they seem. Public or free Wi-Fi networks are often unsecured or malicious, and once your device connects, the data it sends can be exposed to cybercriminals. So beware, and avoid such networks whenever you can.

9. Don’t choose the “remember me” option

Often, when we enter a password on a website or card details during an online transaction, a dialog box appears offering to remember them for next time. The site need not be fraudulent for this to be risky: if you choose to be remembered, your data could later be used by hackers who gain access to your device by any means.

10. Go for two-step verification

Social media sites often give you the option of two-step verification. With it, they ask you to enter your password and also a verification code sent to you by SMS. This helps prevent your accounts from getting hacked, because even if a hacker gets hold of your password, he won’t be able to enter the code. Moreover, you will be notified that someone is trying to access your account.
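
For readers curious how such verification codes can work, the Python sketch below implements the HMAC-based one-time password scheme standardised in RFC 4226. It is one common way to produce short numeric codes, shown here under the assumption that the site and the user share a secret; individual sites may generate their SMS codes differently, for instance as random numbers stored on the server.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HMAC-based one-time password as specified in RFC 4226."""
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer.
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, counter: int, submitted: str) -> bool:
    """Check a submitted code in constant time to avoid timing leaks."""
    return hmac.compare_digest(hotp(secret, counter), submitted)


# Shared secret taken from the RFC 4226 test vectors.
secret = b"12345678901234567890"
print(hotp(secret, 0))  # 755224, per the RFC 4226 test vectors
```

An attacker who knows only the password cannot produce the next code without the shared secret, which is the property the tip above relies on.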

11. Keep your devices locked

We understand that unlocking your device every time with a pattern, PIN, or password can be irritating, but it is necessary. Set the security level of your lock screen to high, so that if someone tries to unlock your screen without your permission and enters the wrong data 5-10 times, the device resets to factory settings and wipes all of its data.

12. The log out option is there for your safety

Whenever you are done with a social media site, log out of your account on the device, no matter who owns the device. This helps keep your chats, pictures, and other data safe from anyone who might use that device later.

13. Don’t always trust auction sites

Auction sites in particular should not be trusted blindly. They ask you for feedback, or ask you to share your details so that they can make you a member of their site. But remember that sharing your information can be a great risk, so it is better not to share anything on such sites.

Conclusion

You may be surprised to learn that cybercriminals and their computer scams are costing Britain £27bn per year; beyond that, unethical use of data can be a severe danger. Following these tips will help you keep your data from getting hacked, but remember that these hackers are smart, so stay alert and fight back against them as much as possible.





Decentralised social networks: The choice is yours!

We live in an age where we fear isolation. The one string which connects us with the rest of the world is social media. With people constantly on the move, social networking websites provide the most accessible and affordable medium for news and entertainment. The widespread popularity of social media, and the vague regulations in place, have allowed several media giants to rise. Companies such as Facebook, Twitter and LinkedIn are bigwigs who dominate and monopolize the social networking industry.

What’s more, these organisations have access to unbelievably large amounts of data, and with every passing day more data is uploaded and shared on social media channels. They monetize this information by selling it to advertising and marketing companies to create targeted campaigns, which is not news to anyone.

However, these social media giants have not only dominated the industry; they have also set the norms for how social media websites are built. For example, the server(s) which host the website fall under one authority, i.e. the organisation that created and owns it. Moreover, the source code for the website is closed and inaccessible to anyone but the relevant personnel of the company. They control every single aspect of the website. The most troubling aspect of this model is the amount of control these organisations have over the content which is shared on their feeds. The companies have complete authority to censor information as well as promote certain content.

But after a series of data leak scares in recent years, people are questioning the way these monolithic organisations operate and are looking for alternatives that better protect users’ basic internet rights.

One of those alternatives is a decentralised/distributed social network (DSN) or a federated social network.

The idea of DSNs cropped up as a response to the blatant (and sometimes unethical) data mining which many social media networks undertake. The concept gained further attention when cryptocurrencies gained popularity. Today, millions of people are active users of at least one DSN. While that does not compare to the billions of mainstream social media users, it is still a considerable user base. Some of the more popular DSNs today are Mastodon, Diaspora*, Sphere, Obsidian and Steemit.

A DSN works on a very different ideology from mainstream social networking websites. Most of them follow three basic principles – data security, privacy and transparency. Unlike Facebook and Twitter, DSNs are hosted on multiple servers owned by different people. And since these websites are usually open source, anyone can download the code and tweak it to create their own network or improve the existing one. So, instead of operating on a mediated private server, DSNs work based on peer-to-peer interaction.

Moreover, most DSNs offer encrypted messaging services and the option of anonymity to their users. In fact, some DSNs do not ask for proof of identification, like a phone number, while signing up. Strong advocates of this alternative form of social networking emphasize that this allows users to be completely in control of their own profile and the content shared on it. They can control what they see, what to show and who to show the content to. Websites like Mastodon also refuse to host paid advertising on their platforms. Hence users get to see genuine content instead of sponsored campaigns. On the other hand, some websites like Steemit and Sphere allow users to monetize their content by utilizing the concept of cryptocurrency, or ‘tokens’.

However, while the idea of DSNs is a good one, it is not a fool-proof method to combat mainstream social media. There are still several challenges and disadvantages to using a DSN account.

The most obvious challenge is attracting a permanent user base. Although existing users already number more than a million, DSNs are still relatively unheard of. People prefer to use mainstream social media like Facebook and Twitter because they are very easy to use. DSNs, on the other hand, can be difficult for new users to navigate.

Moreover, not everyone is interested in hosting a web server on their computer, and this further discourages the average internet user from signing up on a DSN.

Another serious challenge which creators of DSNs face, whether they agree or not, is security. Most DSNs do not ask for real-world proof of identity. Instead, they rely on public key cryptography to secure their user accounts. However, this is extremely difficult to manage. Not to mention that the basic issues that plague social media, like fake accounts, incorrect information sources, fake news, echo chambers and filter bubbles, persist.

In conclusion, the concept of DSNs is very promising. The intention of their creators to turn the internet back into an open, free web is noble. And it is good to know that there are alternatives to Facebook and Twitter. However, there remains a lot to be done before a DSN can be considered a fool-proof replacement for mainstream social media sites.

So, which one will you choose? Mainstream social media, or a decentralised one?





You should start using AI chatbots. Here is why!

The shift towards artificial intelligence (AI) and machine learning is all around us. A surprising 80% of enterprises already invest in some form of AI today. Business communication tools are no exception. Both well-known and newer chat apps, like Facebook Messenger or Chanty, are actively introducing AI-powered features. Along with this trend, chatbots have become the hot topic of the last few years.

In a nutshell, chatbots are any bots that live in chat platforms. Unlike humans, these conversational agents are available for work 24/7. This feature turns chatbots into ideal tools for delivering information services. No wonder people started to actively harness them in different industry fields (like banking or publishing) and on e-commerce websites.

The arrival of chatbots in the messaging app industry has also opened a new door for the way people collaborate at work. Team messengers have become not only a place to exchange information, but also powerful assistants. How exactly can a team take advantage of a team chat app with chatbots? Below are the main points to consider.

1. Easy task and project management

The ever-evolving capabilities of chatbots help team messengers ease the entire project management process. Employees no longer need to leave their team chat app to make reports or to track time and expenses. For example, Busybot lets users create and assign tasks right in Slack. Talla, in turn, keeps everybody focused on the critical tasks with alerts and reminders.

2. Automated routine processes

Frankly speaking, we all hate doing the same work over and over. Nevertheless, office workers spend about 552 hours a year doing work they have done before. The possibility of taking daily tasks off our shoulders sounds tempting, doesn’t it? Chatbots are here to fight the working routine. With their help, employees can concentrate on significant tasks only, saving time and improving work efficiency.

3. Single information center

Employees’ productivity depends heavily on one aspect: speed. The modern working process suffers as we lose time switching between different online tools about 300 times per day. Chatbots integrated with a team messenger and other apps fix this problem. They pull information from all third-party tools and gather insights in your team chat application.

4. Smarter customer service

According to Gartner, chatbots will power 85% of all customer service interactions by the year 2020. And why not? Chatbots like Twyla or Clare.AI improve any existing helpdesk or live chat support, resolving customers’ issues in the blink of an eye.

5. Saving costs

CNBC states: “Chatbots currently account for business cost savings of $20 million globally”. Of course, these virtual agents work round the clock and don’t ask for sick leave, vacations or days off. With their ability to multitask, chatbots interact with several people at once and are slowly replacing real employees. Besides, a huge number of platforms for building chatbots will let you create an assistant for any business need.

To sum up, we have touched on five main benefits of integrating chatbots into team communication tools. Virtual agents help messengers serve as a single information center and become awesome co-pilots in managing projects or dealing with repetitive tasks. Moreover, their 24/7 accessibility contributes to instant customer support and saves money by answering more questions in less time without human involvement.

Once you choose the right chatbots for your team messenger, the company will not only improve internal communication, but also start working more efficiently.





Comparing social media security: How do they protect your information?

How is your information protected on social media? Is it protected? These are the questions we’re asking ourselves more and more, especially in the light of high-profile privacy scandals that demonstrate how vulnerable our data really is.

In fact, recent research from the Pew Research Center shows that the majority of Americans don’t trust social media sites; according to their “Americans and Cybersecurity” study, 51 percent of respondents said they’re not confident in the ability of social media websites to protect their data.

Despite waning trust and publicized breaches, we’re still logging in — and giving our information away. To better understand just what data we’re sending to social media providers, and determine how they keep our personal data safe, Varonis looked at the security blogs of three popular social sites: Facebook, LinkedIn, and Twitter. Check out the full infographic on social media security below.






Net Neutrality: 5 things you need to know before you join the debate

In the wake of the Federal Communications Commission’s announcement that it will repeal the 2015 net neutrality laws in the United States, there has been an air of general confusion and disappointment among the American public. Ajit Pai’s stubborn determination to completely undo the previous laws regulating internet providers has sparked a second round of debate, one which had been silent since 2015, when the laws first came into existence.

But despite the constant buzz around net neutrality, not many people have a clear understanding of it, and there are many varied views on the topic flying around. Considering the extent of the divide between the advocates of net neutrality and its opponents, it’s not surprising that the average individual would be confused. So, whether you decide to support net neutrality or not, here are a few facts to bear in mind before you jump into the argument.

1. Net neutrality is a principle, not a law. Much like the concept of freedom of speech, it is a fundamental right of internet users. Net neutrality is basically the principle of an open and fair internet.

2. Net neutrality ensures that Internet Service Providers (ISPs) treat online data equally, without any discrimination by user, content, website, platform, application, type of attached equipment or method of communication. Consumers can choose the digital content they prefer to see, without the broadband providers limiting the options available to them or discriminating between certain content providers.

3. ISPs have divided opinions on net neutrality. Large ISPs, especially those in the U.S., strongly oppose the concept of treating all data on the internet equally. Companies like Verizon, Comcast and AT&T argue that strong internet regulation could negatively affect business for small and new enterprises, as well as wipe out new competition. While bigger organisations like Google, Netflix and Amazon would be able to survive despite the regulations, they believe that net neutrality could curb innovation, especially for smaller enterprises. However, this is definitely not the opinion of all broadband providers. A group of smaller and local ISPs have joined together and filed a lawsuit against the American government and the FCC in a bid to retain the previous net neutrality laws. They believe that the FCC’s repeal would benefit only the largest established broadband providers and adversely affect consumers, content providers and smaller ISPs.

4. Enterprises and websites are for net neutrality. Sharing the opinion of the small-time ISPs, tech, media and e-commerce establishments and websites believe that net neutrality is an absolute necessity. Of course, if broadband companies begin charging content providers more in exchange for better services, then already established giants would easily be able to pay the price by simply charging their customers more. Smaller websites and companies, however, would not be able to do so. While giants like Google, Amazon, Facebook and Twitter have voiced their support for an open and fair internet, they have remained somewhat aloof in their approach to the issue this time around. Websites such as Tumblr, Reddit, Etsy and thousands more, by contrast, have decided to take a more active stance against the FCC’s repeal. They have decided to be part of a campaign scheduled to take place on May 9th, ahead of the Senate vote.

5. Different countries have different approaches to net neutrality and data protection. Countries like Brazil and Portugal (and the United States before the controversial repeal) banned throttling and blocking of data/websites, but not zero-rating. In Japan, the government follows a fairly ‘hands-off’ approach to net neutrality, as the industry itself obeys voluntary self-regulatory measures. And in Australia there are no net neutrality laws in place. Instead, the country has fairly strong consumer protection laws, which focus heavily on transparency on the ISPs’ part. Similarly, in India, the newly introduced internet regulations focus on complete transparency.





Source: http://bigdata-madesimple.com/

This user is on the @buildawhale blacklist for one or more of the following reasons:

  • Spam
  • Plagiarism
  • Scam or Fraud

Brother, thank you so much for sharing this kind of post among us.

You got a 6.70% upvote from @redlambo courtesy of @sciencefeed! Make sure to use tag #redlambo to be considered for the curation post!
