
Stripping filename from HTML Encoded UNIX Path

I am writing a NodeJS application using Express and Google Datastore. I am trying to get the filename from a UNIX path. The path is stored in an HTML encoded format in the database.

Here's the path un-encoded:

/toplevel/example/test123.txt

Here's how the path is stored in the database, in HTML-encoded format:

&#x2F;toplevel&#x2F;example&#x2F;test123.txt

Since the path is HTML encoded, this line is not working.

let filename_only = requested_filepath_unescaped.split('/').pop().toString();

I also tried splitting by the encoded characters but that does not work either (perhaps because split doesn't work with multiple characters?)

let filename_only = requested_filepath_unescaped.split('&#x2F').pop().toString();

What is the best way to either split the string as-is, or decode the HTML back into an unencoded string?

Well, split does work with multi-character separators, so I don't know what went wrong when you tried it.
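As a quick check of that claim, here is a minimal sketch. It assumes the stored value uses the `&#x2F;` entity for each slash (the sample string is reconstructed from the question, not taken from a real database). One possible culprit it illustrates: a separator without the trailing `;`, as in the question's second attempt, leaves that `;` attached to the result.

```javascript
// Hypothetical sample value, reconstructed from the path in the question.
const encodedPath = '&#x2F;toplevel&#x2F;example&#x2F;test123.txt';

// Separator without the trailing ';' (as in the question's attempt):
// the ';' is not consumed, so it stays at the start of each segment.
const withoutSemicolon = encodedPath.split('&#x2F').pop();

// Separator with the complete entity, including the ';'.
const withSemicolon = encodedPath.split('&#x2F;').pop();

console.log(withoutSemicolon); // ";test123.txt"
console.log(withSemicolon);    // "test123.txt"
```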

However, if you can use jQuery, you can also decode the HTML like this:

var htmlDecoded = $('<div />').html(htmlEncoded).text()

After that you can split on '/'.

(The code I gave creates a div element in memory; it is not added to the DOM, i.e. the web page. It then sets the element's HTML, which automatically decodes the HTML entities.)
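If no DOM or jQuery is available (for example in a plain Node process), one possible alternative is to decode the numeric entities by hand before splitting. The following is only a sketch: `decodeNumericEntities` and `filenameFromEncodedPath` are hypothetical helper names, and it handles only numeric entities such as `&#x2F;` or `&#47;`, not named ones like `&amp;`.

```javascript
// Hypothetical helper: decodes numeric HTML entities (hex and decimal) in one pass.
// Named entities such as &amp; are left untouched.
function decodeNumericEntities(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave the entity as-is if it is not a valid Unicode code point.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

// Hypothetical helper: decode first, then take the last path segment.
function filenameFromEncodedPath(encodedPath) {
  return decodeNumericEntities(String(encodedPath)).split('/').pop();
}

// Sample value reconstructed from the question.
console.log(filenameFromEncodedPath('&#x2F;toplevel&#x2F;example&#x2F;test123.txt'));
```

Decoding in a single regex pass avoids decoding the same text twice, which could happen if the hex and decimal forms were replaced in two separate steps.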

EDIT: As I am unsure what the problem of the OP is and I can't comment due to low reputation, I give some more suggestions here.

Maybe the variable you call split on is not really a string object. Try converting to string first:

var filename = filepath.toString().split('&#x2F;');

Another option is to use a regex. I don't know exactly what that would solve, but it might be worth trying.

var filename = filepath.toString().split(/&#x2F;/);

EDIT2: Tested and working in Chrome v62 and Node v6.11.4.
