- Size: 42.54 MB
- File type: .zip
- Coins: 1
- Downloads: 0
- Published: 2023-07-01
- Language: Python
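Resource description
Apache Beam: an advanced, unified programming model that lets batch and streaming data-processing jobs run on any execution engine.
Code snippet and file information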
"""Script to fix the links in the staged website.

Finds all internal links which do not have index.html at the end and appends
index.html in the appropriate place (preserving anchors, etc).

Usage:
  From root directory, after running the jekyll build, execute
  'python .jenkins/append_index_html_to_internal_links.py'.

Dependencies:
  beautifulsoup4
  Installable via pip as 'sudo pip install beautifulsoup4' or apt via
  'sudo apt-get install python-beautifulsoup4'.
"""

import fnmatch
import os
import re
from bs4 import BeautifulSoup

# Original link match. Matches any string which starts with '/' and doesn't
# have a file extension.
linkMatch = r'^\/(.*\.(?!([^\/]+)$))?[^.]*$'

# Regex which matches strings of type /internal/link/#anchor. Breaks into two
# groups for ease of inserting 'index.html'.
anchorMatch1 = r'(.+\/)(#[^\/]+$)'

# Regex which matches strings of type /internal/link#anchor. Breaks into two
# groups for ease of inserting 'index.html'.
anchorMatch2 = r'(.+\/[a-zA-Z0-9]+)(#[^\/]+$)'

matches = []
# Recursively walk content directory and find all html files.
for root, dirnames, filenames in os.walk('content'):
  for filename in fnmatch.filter(filenames, '*.html'):
    # Javadoc does not have the index.html problem, so omit it.
    if 'javadoc' not in root:
      matches.append(os.path.join(root, filename))

print 'Matches: ' + str(len(matches))
# Iterates over each matched file looking for link matches.
for match in matches:
  print 'Fixing links in: ' + match
  mf = open(match)
  soup = BeautifulSoup(mf)
  # Iterates over every <a>
  for a in soup.findAll('a'):
    try:
      hr = a['href']
      if re.match(linkMatch, hr) is not None:
        if hr.endswith('/'):
          # /internal/link/
          a['href'] = hr + 'index.html'
        elif re.match(anchorMatch1, hr) is not None:
          # /internal/link/#anchor
          mat = re.match(anchorMatch1, hr)
          a['href'] = mat.group(1) + 'index.html' + mat.group(2)
        elif re.match(anchorMatch2, hr) is not None:
          # /internal/link#anchor
          mat = re.match(anchorMatch2, hr)
          a['href'] = mat.group(1) + '/index.html' + mat.group(2)
        else:
          # /internal/link
          a['href'] = hr + '/index.html'
        mf.close()

        html = unicode(soup).encode('utf-8')
        # Write back to the file.
        with open(match, "wb") as f:
          print 'Replacing ' + hr + ' with: ' + a['href']
          f.write(html)
    except KeyError as e:
      # Some <a> tags don't have an href.
      continue
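
The script above is written in Python 2 syntax (print statements, `unicode`). To make its four href cases easier to follow in isolation, here is a minimal Python 3 sketch that applies the same three regular expressions to a single href string, with no file I/O and no BeautifulSoup. The function name `append_index_html` and the upper-case constant names are hypothetical, chosen for illustration; only the patterns and the branch order come from the script.

```python
import re

# Patterns copied from the script above; the constant names are illustrative.
LINK_MATCH = r'^\/(.*\.(?!([^\/]+)$))?[^.]*$'
ANCHOR_MATCH_1 = r'(.+\/)(#[^\/]+$)'
ANCHOR_MATCH_2 = r'(.+\/[a-zA-Z0-9]+)(#[^\/]+$)'


def append_index_html(href):
    """Return href with 'index.html' inserted, mirroring the script's branches."""
    # Leave external links and links with a file extension untouched.
    if re.match(LINK_MATCH, href) is None:
        return href
    if href.endswith('/'):
        # /internal/link/
        return href + 'index.html'
    mat = re.match(ANCHOR_MATCH_1, href)
    if mat is not None:
        # /internal/link/#anchor
        return mat.group(1) + 'index.html' + mat.group(2)
    mat = re.match(ANCHOR_MATCH_2, href)
    if mat is not None:
        # /internal/link#anchor
        return mat.group(1) + '/index.html' + mat.group(2)
    # /internal/link
    return href + '/index.html'


if __name__ == '__main__':
    for sample in ['/documentation/',
                   '/documentation',
                   '/get-started/quickstart/#setup',
                   '/get-started/quickstart#setup',
                   '/images/logo.png',
                   'https://example.com/']:
        print(sample, '->', append_index_html(sample))
```

Keeping the rewrite as a pure function of the href makes each case checkable without a built site; the sample paths in the `__main__` block are made up for the demonstration.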
  Attribute      Size  Date       Time   Name
----------- ---------  ---------- -----  ----
  Directory         0  2018-11-30 02:32  beam-site-zh-master\
       File       653  2018-11-30 02:32  beam-site-zh-master\.travis.yml
       File     35141  2018-11-30 02:32  beam-site-zh-master\LICENSE
       File      2283  2018-11-30 02:32  beam-site-zh-master\README.md
  Directory         0  2018-11-30 02:32  beam-site-zh-master\project\
       File      1172  2018-11-30 02:32  beam-site-zh-master\project\translate-flow.md
  Directory         0  2018-11-30 02:32  beam-site-zh-master\site\
  Directory         0  2018-11-30 02:32  beam-site-zh-master\site\en\
       File       657  2018-11-30 02:32  beam-site-zh-master\site\en\.gitattributes
       File       117  2018-11-30 02:32  beam-site-zh-master\site\en\.gitignore
  Directory         0  2018-11-30 02:32  beam-site-zh-master\site\en\.jenkins\
       File      2557  2018-11-30 02:32  beam-site-zh-master\site\en\.jenkins\append_index_html_to_internal_li
       File       472  2018-11-30 02:32  beam-site-zh-master\site\en\Gemfile
       File      1925  2018-11-30 02:32  beam-site-zh-master\site\en\Gemfile.lock
       File      4050  2018-11-30 02:32  beam-site-zh-master\site\en\README.md
       File       331  2018-11-30 02:32  beam-site-zh-master\site\en\Rakefile
       File      1878  2018-11-30 02:32  beam-site-zh-master\site\en\_config.yml
       File       109  2018-11-30 02:32  beam-site-zh-master\site\en\_config_test.yml
       File      1886  2018-11-30 02:32  beam-site-zh-master\site\en\run_with_docker.sh
  Directory         0  2018-11-30 02:32  beam-site-zh-master\site\en\src\
       File       484  2018-11-30 02:32  beam-site-zh-master\site\en\src\.htaccess
  Directory         0  2018-11-30 02:32  beam-site-zh-master\site\en\src\_beam_team\
       File      4741  2018-11-30 02:32  beam-site-zh-master\site\en\src\_beam_team\team.md
  Directory         0  2018-11-30 02:32  beam-site-zh-master\site\en\src\_data\
       File      1013  2018-11-30 02:32  beam-site-zh-master\site\en\src\_data\authors.yml
       File     33956  2018-11-30 02:32  beam-site-zh-master\site\en\src\_data\capability-matrix.yml
       File       217  2018-11-30 02:32  beam-site-zh-master\site\en\src\_data\logos.yml
       File       878  2018-11-30 02:32  beam-site-zh-master\site\en\src\_data\meetings.yml
  Directory         0  2018-11-30 02:32  beam-site-zh-master\site\en\src\_includes\
       File       358  2018-11-30 02:32  beam-site-zh-master\site\en\src\_includes\authors-list.md
       File       300  2018-11-30 02:32  beam-site-zh-master\site\en\src\_includes\capability-matrix-common.md
............ 792 more file entries omitted