I have a directory of files. Most of them are .patch files, but some are not. For each patch file, I need to read the first line, which is always a string, and extract a portion of it. By combining each file's name, that portion of its first line, and a unique identifier for each file, I need to build a hash, which I will then convert to JSON and write to a file.
Here are examples:
|__ .gitkeep
|__ pmet-add-install-module-timings.patch
|__ pmet-change-sample-data-load-order.patch
|__ pmet-stop-catching-sample-data-errrors-during-install.patch
|__ pmet-fix-admin-label-word-breaking.patch
|__ pmet-declare-shipping-factory-for-compilation.patch.disabled
...
File Name: pmet-add-install-module-timings.patch
First Line: diff --git a/setup/src/Magento/Setup/Model/Installer.php b/setup/src/Magento/Setup/Model/Installer.php
File Name: pmet-change-sample-data-load-order.patch
First Line: diff --git a/vendor/magento/module-sample-data/etc/module.xml b/vendor/magento/module-sample-data/etc/module.xml
File Name: pmet-stop-catching-sample-data-errrors-during-install.patch
First Line: diff --git a/vendor/magento/framework/Setup/SampleData/Executor.php b/vendor/magento/framework/Setup/SampleData/Executor.php
File Name: pmet-fix-admin-label-word-breaking.patch
First Line: diff --git a/vendor/magento/theme-adminhtml-backend/web/css/styles-old.less b/vendor/magento/theme-adminhtml-backend/web/css/styles-old.less
{
"patches": {
"magento/magento2-base": {
"Patch 1": "m2-hotfixes/pmet-add-install-module-timings.patch"
},
"magento/module-sample-data": {
"Patch 2": "m2-hotfixes/pmet-change-sample-data-load-order.patch"
},
"magento/theme-adminhtml-backend": {
"Patch 3": "m2-hotfixes/pmet-fix-admin-label-word-breaking.patch"
},
"magento/framework": {
"Patch 4": "m2-hotfixes/pmet-stop-catching-sample-data-errrors-during-install.patch"
}
}
}
scrape.rb
Here's the code in its entirety. I will break it down further below:
files_hash = Hash.new
modules = Array.new
data_hash = {
patches: {}
}
files = Dir["*.patch"]
files.each do |file|
files_hash.store(files.index(file), file)
end
files_hash.each do |key, file|
value = File.open(file, &:readline).split('/')[3]
if value.match(/module-/) || value.match(/theme-/)
result = "magento/#{value}"
else
result = "magento2-base"
end
modules << result
modules.each do |val|
data_hash[:patches][val][key] = "m2-hotfixes/#{file}"
end
end
print data_hash
Before I highlight the problem, I need to first detail what I've done to achieve the desired end result:
First, I set up an empty files hash, module array, and data hash:
files_hash = Hash.new
modules = Array.new
data_hash = {
patches: {}
}
Next, I scan the patch file directory for .patch files and store them in the files hash with their keys. (I figure I can use the keys as the patch labels in the JSON file):
files = Dir["*.patch"]
files.each do |file|
files_hash.store(files.index(file), file)
end
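As a side note, the same index-to-file mapping can be built without calling files.index for every entry. The sketch below is only an illustration: index_files is a hypothetical helper name, the sample file names are made up, and the sort call is an assumption added so the numbering does not depend on the order in which the directory is listed.

```ruby
# Hypothetical helper: pairs each file name with its position in the sorted list.
def index_files(files)
  files.sort.each_with_index.to_h { |file, index| [index, file] }
end

# In the script this would be called as index_files(Dir["*.patch"]).
# The names below are placeholders for illustration only.
sample = ["pmet-b.patch", "pmet-a.patch"]
p index_files(sample)
```

The keys start at 0 here, just like the files.index version, so the patch labels would still need an offset if they are meant to start at "Patch 1".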
Next, I use the file hash to read the first line of each patch file. I notice a pattern in the patch files which I believe will be reliable: each file has either magento/module-name, magento/theme-name or something else. Both the module-name and theme-name cases will use the magento/#{value} syntax. The "something else" case will use magento/magento2-base:
files_hash.each do |key, file|
value = File.open(file, &:readline).split('/')[3]
if value.match(/module-/) || value.match(/theme-/)
result = "magento/#{value}"
else
result = "magento2-base"
end
modules << result
...
This isn't an ideal solution (what if the diff structure changes?), but it works for now, and I couldn't figure out the proper regex to search the strings and return the same result. The above code gives me what I want, which is the following array:
#=>["magento2-base", "magento/module-sample-data", "magento/theme-adminhtml-backend", "magento2-base"]
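For the regex route, here is a minimal sketch that is meant to give the same result as the split('/')[3] check for the sample first lines shown earlier. PACKAGE_PATTERN and package_for are hypothetical names, and the pattern assumes the first line always starts with "diff --git a/" and that modules and themes live under vendor/magento/; anything else falls back to "magento2-base", as in the code above.

```ruby
# Assumed pattern: captures "module-..." or "theme-..." directly under vendor/magento/.
PACKAGE_PATTERN = %r{\Adiff --git a/vendor/magento/((?:module|theme)-[^/\s]+)/}

# Hypothetical helper: returns the package name for a patch's first line.
def package_for(first_line)
  match = PACKAGE_PATTERN.match(first_line)
  match ? "magento/#{match[1]}" : "magento2-base"
end

puts package_for("diff --git a/vendor/magento/module-sample-data/etc/module.xml b/vendor/magento/module-sample-data/etc/module.xml")
puts package_for("diff --git a/setup/src/Magento/Setup/Model/Installer.php b/setup/src/Magento/Setup/Model/Installer.php")
```

Unlike the split version, this anchors on the vendor/magento/ prefix, so a path that merely contains "module-" somewhere else would not be treated as a module.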
Next, while I still have access to the file names and keys from the file hash, I need to loop through this array and create hashes which have the array values as keys and the file names (appended to a file path) as values. Like so (or so I thought):
modules.each do |val|
data_hash[:patches][val][key] = "m2-hotfixes/#{file}"
end
end
It's this part of the code which I'm having issues with. Running this gives me the following error:
Traceback (most recent call last):
4: from scrape.rb:10:in `<main>'
3: from scrape.rb:10:in `each'
2: from scrape.rb:18:in `block in <main>'
1: from scrape.rb:18:in `each'
scrape.rb:19:in `block (2 levels) in <main>': undefined method `[]=' for nil:NilClass (NoMethodError)
I notice that if I omit key, i.e. change data_hash[:patches][val][key] to data_hash[:patches][val], I get a hash of values.
So then, my obvious question:
Why doesn't my approach of nesting the hashes one level further using keys work above?
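As far as I can tell, the error comes from data_hash[:patches][val] never having been assigned: looking up a missing key in a plain hash returns nil, and nil has no []= method. A minimal sketch of one way around it, assuming the inner hash should be created the first time a package is seen (store_patch is a hypothetical helper, and the path is a placeholder):

```ruby
data_hash = { patches: {} }

# The inner hash does not exist yet, so the lookup returns nil.
p data_hash[:patches]["magento2-base"]

# Hypothetical helper: creates the inner hash on first use, then stores the patch.
def store_patch(data_hash, package, key, path)
  data_hash[:patches][package] ||= {}
  data_hash[:patches][package][key] = path
  data_hash
end

store_patch(data_hash, "magento2-base", 0, "m2-hotfixes/example.patch")
p data_hash
```

Separately, the inner modules.each loop runs over every module collected so far on each pass, so even with the nil fixed it would store the current file under all of those modules rather than only its own.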
Here's the answer I've come up with:
require 'json'
files_hash = Hash.new
file_hash = Hash.new
result_hash = Hash.new
data_hash = Hash.new
remap_hash = Hash.new
final_hash = {"patches" => {}}
files = Dir["*.patch"]
patches_file = File.dirname(File.expand_path(__FILE__)) + "/patches.json"
files.each do |file|
files_hash.store(files.index(file), file)
end
files_hash.each do |key, file|
value = File.open(file, &:readline).split('/')[3]
if value.match(/module-/) || value.match(/theme-/)
result = "magento/#{value}"
else
result = "magento2-base"
end
data_hash[key] = result
file_hash[key] = "m2-hotfixes/#{file}"
end
data_hash.each do |data_key, data_vals|
file_hash.each do |file_key, file_vals|
if data_key == file_key
remap_hash[data_vals] = {
"Patch #{data_key}" => file_vals
}
end
end
end
final_hash["patches"].merge!(remap_hash)
File.open(patches_file, "w+") do |file|
file.puts(final_hash.to_json)
end
This yields the following patches.json.
NB: I used a different set of patch files than above; what matters here is the formatting:
{
"patches": {
"magento2-base": {
"Patch 6": "m2-hotfixes/pmet-fix-module-loader-algorithm.patch"
},
"magento/module-downloadable-sample-data": {
"Patch 2": "m2-hotfixes/pmet-fix-sample-data-code-generator.patch"
},
"magento/module-customer": {
"Patch 3": "m2-hotfixes/pmet-visitor-segment.patch"
}
}
}
Some pros and cons:
It works. Awesome.
I know it can be more elegant
One unforeseen need I have run into is that my patches.json file may have multiple patches which need to be applied under the same key. Something like:
{
"patches": {
"magento2-base": {
"Patch 1": "m2-hotfixes/pmet-add-install-module-timings.patch",
"Patch 6": "m2-hotfixes/pmet-fix-module-loader-algorithm.patch"
},
"magento/module-downloadable-sample-data": {
"Patch 2": "m2-hotfixes/pmet-fix-sample-data-code-generator.patch"
},
"magento/module-customer": {
"Patch 3": "m2-hotfixes/pmet-visitor-segment.patch"
}
}
}
My solution needs to account for this, since Ruby doesn't allow for duplicate keys in hashes.
I would welcome a more elegant solution to the challenge that considered this quirk.
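One possible direction, offered as a sketch rather than a finished solution: give the outer hash a default block so each package gets its own inner hash, and add every patch to that inner hash instead of replacing it. package_for and build_patches are hypothetical helper names; the package rule is copied from the code above, and numbering the patches from 1 is an assumption based on the desired output.

```ruby
require 'json'

# Hypothetical helper: same rule as above, based on the fourth path segment.
def package_for(first_line)
  segment = first_line.split('/')[3].to_s
  segment.match?(/module-|theme-/) ? "magento/#{segment}" : "magento2-base"
end

# Hypothetical helper: takes [file_name, first_line] pairs and groups them by package.
def build_patches(entries)
  # Each new package key gets its own empty hash on first access.
  patches = Hash.new { |hash, package| hash[package] = {} }
  entries.each.with_index(1) do |(file, first_line), number|
    patches[package_for(first_line)]["Patch #{number}"] = "m2-hotfixes/#{file}"
  end
  { "patches" => patches }
end

# Sample first lines taken from the examples above.
entries = [
  ["pmet-add-install-module-timings.patch",
   "diff --git a/setup/src/Magento/Setup/Model/Installer.php b/setup/src/Magento/Setup/Model/Installer.php"],
  ["pmet-change-sample-data-load-order.patch",
   "diff --git a/vendor/magento/module-sample-data/etc/module.xml b/vendor/magento/module-sample-data/etc/module.xml"]
]
puts JSON.pretty_generate(build_patches(entries))
```

In the real script the entries could be built with something like Dir["*.patch"].sort.map { |file| [file, File.open(file, &:readline)] }. Because every patch gets a distinct "Patch N" label, two patches for the same package end up as two entries in the same inner hash and neither overwrites the other.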