
How do I put a file from my local laptop onto a remote HDFS?

I have Hadoop 2.8.1.

I configured hdfs-site.xml:

<configuration>
<!-- Add the following inside the configuration tag -->
<property>
        <name>dfs.data.dir</name>
        <value>/app/dfs/name/data</value>
        <final>true</final>
</property>
<property>
        <name>dfs.name.dir</name>
        <value>/app/dfs/name</value>
        <final>true</final>
</property>
<property>
        <name>dfs.replication</name>
        <value>1</value>
</property>
<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>
</configuration>

I found this Python code:

from pywebhdfs.webhdfs import PyWebHdfsClient

# Connect to the NameNode's WebHDFS endpoint (default port 50070 in Hadoop 2.x)
hdfs = PyWebHdfsClient(host='hadoop01', port='50070', user_name='hadoop')  # your NameNode host & username here

my_data = "01010101010101010101010101010101000111 Example DataSet"
my_file = '/examples/myfile.txt'
hdfs.create_file(my_file, my_data.encode('utf-8'))

This variant worked. BUT I want to upload an already prepared file to the remote HDFS.

I tried writing:

with open("C:\\Downloads\\Demographic_Statistics_By_Zip_Code.csv") as file_data:
    print(file_data)

BUT the file was not put to HDFS; this only printed:

<_io.TextIOWrapper name='C:\\Downloads\\Demographic_Statistics_By_Zip_Code.csv' mode='r' encoding='cp1251'>

How do I solve this?
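One way, staying with pywebhdfs: `open()` by itself only returns a file object; nothing reaches HDFS until you read the file's bytes and pass them to `create_file`. A minimal sketch, where `upload_local_file` is a hypothetical helper and the paths/client settings are taken from the question:

```python
def upload_local_file(hdfs, local_path, hdfs_path):
    """Read a local file as raw bytes and write it to HDFS via WebHDFS."""
    # Binary mode avoids text-mode encoding surprises (cp1251 in the output above)
    with open(local_path, 'rb') as f:
        data = f.read()
    hdfs.create_file(hdfs_path, data)

# Usage (requires a reachable NameNode):
# hdfs = PyWebHdfsClient(host='hadoop01', port='50070', user_name='hadoop')
# upload_local_file(hdfs,
#                   r'C:\Downloads\Demographic_Statistics_By_Zip_Code.csv',
#                   '/examples/Demographic_Statistics_By_Zip_Code.csv')
```

Note this reads the whole file into memory, which is fine for a small CSV but not for very large files.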

How about using the HDFS CLI? Please refer to the copyFromLocal or put command from this link:

https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/FileSystemShell.html
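For example, assuming the hdfs client is configured on the machine holding the file (the paths here are placeholders, and the commands need a running cluster):

```shell
# Create the target directory if it does not exist yet
hdfs dfs -mkdir -p /examples
# Copy the local file into HDFS
hdfs dfs -put Demographic_Statistics_By_Zip_Code.csv /examples/
# Verify it arrived
hdfs dfs -ls /examples
```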

