
REST API vs. Sqoop

I was trying to import data from MySQL into HDFS. I was able to do it with Sqoop, but it can also be done by fetching the data through an API.

My question is: when should I use a REST API to load data into HDFS instead of Sqoop?

Please describe the differences, with use cases.

Sqoop (SQL <=> Hadoop) is mainly used for loading data from an RDBMS into HDFS.

It makes a direct connection to the database. If privileges are not properly restricted for the database user that Sqoop connects as, that user can append, modify, or delete data in tables with the sqoop eval command.
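As a rough illustration of the Sqoop side, the sketch below assembles the argument list for a basic sqoop import of one table into HDFS. The host, database, user, and paths are hypothetical placeholders, and the helper name build_sqoop_import is invented for this example; only the sqoop flags themselves are real.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file, table, target_dir, num_mappers=4):
    """Return the argv list for a basic `sqoop import` of one table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,  # keeps the password off the command line
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


if __name__ == "__main__":
    # Hypothetical connection details, for illustration only.
    cmd = build_sqoop_import(
        "jdbc:mysql://db.example.com/shop",
        "etl_reader",
        "/user/etl/.mysql-password",
        "users",
        "/data/raw/users",
    )
    print(shlex.join(cmd))
```

On a machine with Sqoop installed, the list could be passed to subprocess.run(cmd, check=True). Since sqoop eval runs arbitrary SQL with the same account's rights, connecting as a user that has only SELECT privileges is the safer choice.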

With a REST web service API, by contrast, we can fetch data from various databases (NoSQL or RDBMS) that the service connects to internally in its code.

Consider calling a getUsersData RESTful web service with curl. The service is designed only to provide user data and does not allow appending, modifying, or updating anything in the database, regardless of the database type (RDBMS or NoSQL).
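A minimal sketch of the REST side, assuming the getUsersData service answers a plain GET with a JSON array of user records (the URL layout and response shape are assumptions, not something the answer specifies). The client can only read what the endpoint chooses to expose, and the result is serialized as newline-delimited JSON, one common layout for files that are later copied into HDFS.

```python
import io
import json
import urllib.request


def fetch_users(base_url, opener=urllib.request.urlopen):
    """GET the hypothetical getUsersData endpoint and return the decoded JSON."""
    with opener(f"{base_url}/getUsersData") as response:
        return json.loads(response.read().decode("utf-8"))


def to_json_lines(records):
    """Serialize records as newline-delimited JSON, one record per line."""
    return "".join(json.dumps(record, sort_keys=True) + "\n" for record in records)


if __name__ == "__main__":
    # Stand-in for the network so the sketch runs offline;
    # a real call would rely on the default opener instead.
    fake = lambda url: io.BytesIO(b'[{"id": 1, "name": "alice"}]')
    print(to_json_lines(fetch_users("http://api.example.com", opener=fake)), end="")
```

The resulting text could be written to a local file and then uploaded with hdfs dfs -put. All of the data passes through this one client process.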

You could use Sqoop to pull data from MySQL into HBase, then put a REST API over HBase (on Hadoop). That would not be much different from a REST API over MySQL.

Basically, you're comparing two different things. Hadoop is not meant to replace traditional databases or N-tier user-facing applications; it is just a more distributed, fault-tolerant place to store large amounts of data.

And you typically wouldn't use a REST API to talk to a database and then put those values into Hadoop, because that wouldn't be distributed: all database results would go through a single process.
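To make the single-process point concrete, here is a simplified model of how a Sqoop import spreads work across mappers: it looks up the minimum and maximum of a split column and gives each mapper a contiguous slice of that range. This is a sketch of the idea, not Sqoop's actual code; the function names and the id column are assumptions chosen for illustration.

```python
def split_ranges(lo, hi, num_mappers):
    """Divide the inclusive key range [lo, hi] into contiguous half-open chunks, one per mapper."""
    step = (hi - lo) / num_mappers
    bounds = [lo + round(i * step) for i in range(num_mappers)] + [hi + 1]
    return [(bounds[i], bounds[i + 1]) for i in range(num_mappers)]


def where_clauses(column, ranges):
    """Render each chunk as the WHERE condition one mapper would use for its own query."""
    return [f"{column} >= {start} AND {column} < {end}" for start, end in ranges]


if __name__ == "__main__":
    for clause in where_clauses("id", split_ranges(0, 1000, 4)):
        print(clause)
```

Each mapper runs its own query and writes its own output file to HDFS in parallel, whereas a REST client pulling the same table would fetch and write every row itself.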
