
Copy specific word from a file to another file using shell script

I am new to shell scripting. My folder structure looks like the listing below, and each folder contains a file named note.json. I want to copy a specific word, e.g. "user", from note.json. I tried it on a single file and it works, but it also prints unnecessary data, and I need to do this in a loop (i.e. go into each folder and do the same thing). Can anyone help me?

My folder structure:

drwxr-xr-x   - zeppelin hdfs          0 2020-06-01 16:20 /user/zeppelin/notebook/2FBC2M3K2
drwxr-xr-x   - zeppelin hdfs          0 2020-05-20 18:01 /user/zeppelin/notebook/2FBDEKUGP
drwxr-xr-x   - zeppelin hdfs          0 2020-05-26 20:32 /user/zeppelin/notebook/2FBDXNZRC
drwxr-xr-x   - zeppelin hdfs          0 2020-05-26 21:00 /user/zeppelin/notebook/2FBEAGZEE
drwxr-xr-x   - zeppelin hdfs          0 2020-05-25 14:18 /user/zeppelin/notebook/2FBGXSHZR
drwxr-xr-x   - zeppelin hdfs          0 2020-05-20 14:31 /user/zeppelin/notebook/2FBHCNKJP
drwxr-xr-x   - zeppelin hdfs          0 2020-06-02 17:34 /user/zeppelin/notebook/2FBJCZ212

I tried the following command on a single file:

$ cat note.json | grep "user"
"user": "Ayan.Paul",
            "data": "org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [Ayan.Paul] does not have [USE] privilege on [snt_mmedata_upload_prd]\n\tat org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300)\n\tat org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286)\n\tat org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:324)\n\tat org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:265)\n\tat org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)\n\tat org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)\n\tat org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:718)\n\tat org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:801)\n\tat org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)\n\tat org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:633)\n\tat org.apache.zeppelin.scheduler.Job.run(Job.java:188)\n\tat org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\nCaused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [Ayan.Paul] 
does not have [USE] privilege on [snt_mmedata_upload_prd]\n\tat org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:335)\n\tat org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:199)\n\tat org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262)\n\tat org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)\n\tat org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:541)\n\tat org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:527)\n\tat org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)\n\tat org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:562)\n\tat org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)\n\tat org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)\n\tat org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)\n\tat org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)\n\tat org.apache.thrift.server.TServlet.doPost(TServlet.java:83)\n\tat org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:208)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:707)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:534)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)\n\tat org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\n\t... 
3 more\nCaused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAccessControlException:Permission denied: user [Ayan.Paul] does not have [USE] privilege on [snt_mmedata_upload_prd]\n\tat org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.checkPrivileges(RangerHiveAuthorizer.java:483)\n\tat org.apache.hadoop.hive.ql.Driver.doAuthorizationV2(Driver.java:1330)\n\tat org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:1094)\n\tat org.apache.hadoop.hive.ql.Driver.compile(Driver.java:705)\n\tat org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1863)\n\tat org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1810)\n\tat org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1805)\n\tat org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)\n\

As mentioned above, if it is structured JSON, the best and cleanest approach is to use jq. Otherwise, if the line always stays the same, you can try:

grep "\"user\":" note.json | cut -d":" -f2 | sed 's/\"//g' | sed 's/,//g' | sed 's/ //g'

Where:

grep "\"user\":" - selects the line you want

cut -d":" -f2 - takes the second field, using ":" as the delimiter

sed 's/\"//g' - removes the double quotes

sed 's/,//g' - removes the commas

sed 's/ //g' - removes spaces, just in case (you don't have to use it)
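Applied to a file containing the "user" line shown in the question, the pipeline behaves like this (the sample note.json below is a made-up one-line stand-in, not the asker's real file):

```shell
# Hypothetical one-line note.json mirroring the line grep found above
printf '    "user": "Ayan.Paul",\n' > note.json

# grep selects the line, cut takes the part after ":", sed strips quotes/commas/spaces
grep '"user":' note.json | cut -d':' -f2 | sed 's/"//g; s/,//g; s/ //g'
# prints: Ayan.Paul
```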

If you need a loop, you could do something like:

folder_path='/path/to/myfolder'
new_file_path='/path/to/output_file'   # where the extracted word should go

for file in "${folder_path}"/*
do
    if [[ "$(basename "${file}")" == "note.json" ]]
    then
        grep "\"user\":" "${file}" | cut -d":" -f2 | sed 's/\"//g' | sed 's/,//g' | sed 's/ //g' > "${new_file_path}"
    fi
done
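Here is a sketch of the same idea extended to every folder at once, appending each extracted value to a single output file. The directory names and user values below are invented for the demo; substitute your real notebook path:

```shell
# Demo folders standing in for /user/zeppelin/notebook/... (names invented)
mkdir -p notebook/2FBC2M3K2 notebook/2FBDEKUGP
printf '    "user": "Ayan.Paul",\n' > notebook/2FBC2M3K2/note.json
printf '    "user": "Jane.Doe",\n'  > notebook/2FBDEKUGP/note.json

# Visit every note.json and append the extracted value to users.txt
: > users.txt   # truncate the output file first
find notebook -name note.json | while read -r f
do
    grep '"user":' "$f" | cut -d':' -f2 | tr -d '", ' >> users.txt
done

sort users.txt   # prints Ayan.Paul, then Jane.Doe
```

Using `>>` instead of `>` inside the loop is what keeps results from earlier folders from being overwritten.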

If you know the note.json file will always have "user" at the start of a line, then you can use grep. It sounds like you want the value of the "user" JSON field, so try using jq to parse it. Below is the "cheap and dirty" way of stripping the extra characters. (We'll stick with a loop, since you are probably doing other things with each file...)

for file in $(find . -name note.json); do
    grep "^.user" "$file" | cut -c 10- | tr -d '",'
done

If you need help parsing the JSON with jq, just ask a separate question showing your note.json file and your attempt at parsing it!
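For what it's worth, a minimal jq sketch could look like the following. This assumes jq is installed and that "user" is a top-level key, which the grep output in the question suggests but does not prove; the sample file here is a made-up stand-in, not the asker's real note.json:

```shell
# Hypothetical minimal note.json with a top-level "user" key
printf '{"user": "Ayan.Paul"}\n' > note.json

# -r prints the raw string without the surrounding JSON quotes
jq -r '.user' note.json    # prints: Ayan.Paul
```

Unlike the grep/sed pipelines, this keeps working even if the file is reformatted, since jq parses the JSON structure rather than matching text.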
