1
pkuphy 2020-06-30 22:35:19 +08:00 1
beautifulsoup4
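For anyone unsure where to start with beautifulsoup4, here is a minimal sketch, assuming the terms sit inside `<strong>` tags as the thread describes. `SAMPLE_HTML` is a made-up stand-in for the real page and `extract_strong_terms` is a hypothetical helper name, not part of bs4.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the real page (assumption: terms are wrapped in <strong>).
SAMPLE_HTML = """
<ul>
  <li><strong>A-Type Plug</strong> (definition text)</li>
  <li><strong>A-Weighting</strong> (definition text)</li>
</ul>
"""


def extract_strong_terms(html):
    """Return the text of every <strong> element, in document order."""
    # "html.parser" is the parser bundled with Python, so nothing beyond bs4 is required.
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all("strong")]


if __name__ == "__main__":
    for term in extract_strong_terms(SAMPLE_HTML):
        print(term)
```

For the real page, the HTML string would come from a saved copy of the page or an HTTP client instead of `SAMPLE_HTML`.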
|
2
shakespark 2020-06-30 22:37:04 +08:00 1
Just doing a plain string search for <strong> isn't hard either, is it?
|
3
cx524541577 OP @pkuphy I took a look and didn't really understand it, but thanks anyway. I'll dig into it some more.
|
4
cx524541577 OP @shakespark There are probably a few hundred terms in there, so copying them one by one would take a long time. Thanks for the reply, though.
|
5
V2tizen 2020-06-30 22:44:49 +08:00 1
Try writing the regex like this: <strong>([\s\S]*?)</strong>
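A minimal sketch of how that pattern could be used from Python's `re` module. The sample string is invented for illustration and `find_strong` is a hypothetical helper name. Note that the pattern only matches a bare `<strong>` tag, so a tag with attributes (for example a `class`) would be missed.

```python
import re

# The pattern from the reply above: a non-greedy match that may span line breaks.
STRONG_RE = re.compile(r"<strong>([\s\S]*?)</strong>")

# Invented sample input; the real page source would go here instead.
SAMPLE_HTML = (
    "<li><strong>ADAT\nLightpipe</strong> text</li>"
    "<li><strong>ADSR</strong> text</li>"
)


def find_strong(html):
    """Return the captured text of each <strong>...</strong> pair."""
    # findall returns the contents of the single capture group for each match;
    # split/join collapses any internal line breaks into single spaces.
    return [" ".join(match.split()) for match in STRONG_RE.findall(html)]


if __name__ == "__main__":
    print("\n".join(find_strong(SAMPLE_HTML)))
```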
|
6
izoabr 2020-06-30 22:45:10 +08:00 2
import re
help(re) |
7
loliordie 2020-06-30 22:45:30 +08:00 1
Take a look at beautifulsoup.
|
8
nomoon 2020-06-30 22:46:09 +08:00 1
regex group ?
|
9
JamesMackerel 2020-06-30 22:47:47 +08:00 via iPhone
50 yuan, ten minutes.
|
10
linvaux 2020-06-30 22:49:46 +08:00 1
bs4, re, lxml: go on, pick one.
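If installing a third-party package is the hurdle, the same idea can be sketched with the standard library's `html.parser`. This is swapped in here as an alternative and is not one of the three libraries named above. `StrongCollector` and `collect_strong` are hypothetical names, and the sample input is invented.

```python
from html.parser import HTMLParser


class StrongCollector(HTMLParser):
    """Collects the text found inside <strong> elements."""

    def __init__(self):
        super().__init__()
        self.terms = []    # finished terms, in document order
        self._depth = 0    # > 0 while inside a <strong> element
        self._buffer = []  # text pieces of the term being read

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag == "strong":
            if self._depth == 0:
                self._buffer = []
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "strong" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join the pieces and collapse runs of whitespace.
                self.terms.append(" ".join("".join(self._buffer).split()))

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def collect_strong(html):
    parser = StrongCollector()
    parser.feed(html)
    parser.close()
    return parser.terms


if __name__ == "__main__":
    sample = '<p><strong class="t">Bit Rate</strong> x <strong>Bus</strong></p>'
    print(collect_strong(sample))
```

Unlike the bare-tag regex approach, this also picks up `<strong>` tags that carry attributes.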
|
11
limuyan44 2020-06-30 22:50:57 +08:00 1
A-Type Plug
A-Weighting AC Accent Mic Acoustic Foam Acoustic Treatment Active Active Loudspeaker or Monitor A/D [A-D] Converter ADAT Lightpipe Additive Synthesis ADSR Active Sensing AES AES3 AES10 AES11 AES17 AES42 AES59 AFL Aftertouch Algorithm Aliasing Ambience Amp (Ampere) Amp/Amplifier Amplitude Analogue (cf. Digital) Analogue Synthesis Anti-alias Filter Application (App) Arming Arpeggiator ASCII Attack Attenuate Audio Data Reduction Audio Frequency Audio Interface Audio Random Access (ARA) Audio Scrubbing Autolocator Auxiliary Sends (Auxes) Aux Return Azimuth B-Type Plug Backup Back Electret Balance Balanced Wiring Band-pass Filter (BPF) Bandwidth Bank Bass Response Bass Tip-up Bass Trap Bantam Plug Beta Version Bias Binary BIOS Bit Bit Rate (see also Sample Rate) Bi-Timbral Blumlein Array BNC Boom Boost/Cut Control Booth Isolation Room Bouncing Boundary Boundary Layer Microphone BPM Breath Controller Buffer Memory Bug Bus buss Byte C-Weighting Cabinet Cabinet Resonance Capacitor Capacitor Microphone Capsule Carbon Microphone Cardioid CD-R CD-R Burner Channel Chase Chip Chord Chorus Chromatic Click Track Clipping Clocking Clone Close-Miking Cloud Codec co dec Coincident Colouration Comb-Filter Common Mode Rejection Compact Cassette Compander Comping Compressor Computer Condenser Microphone Conductor Cone Console Contact Cleaner Continuous Controller Control Voltage Converter Convolution Convolution Reverb Copy Protection CPU Crash Crossover Crossover frequency Cut and Paste Editing Cut-off Frequency Cycle CV Daisy Chain Damping DANTE DAT Data DAW dB dB/Octave DC DCA DCA Group DCC DBX DCO DDL DDP De-emphasis De-esser De-Oxidising Compound Decay Decca Tree Decibel Decoupler (also isolator) Defragment Delay Desk Detent DI Diaphragm DI Box Digital (cf. 
Analogue) Digital Delay Digital Reverberator DIN Connector Diode-Bridge Compressor Direct Coupling Dither Disc Disk DMA Dolby Noise-Reduction Dolby Surround-Sound Dolby HX DOS Dome Double-ended Noise Reduction Double-lapped Screen DSP Drive unit Driver Dropout Drum Pad Drum Booth Isolation Room Dry (cf. Wet) Dubbing Ducking Dump DVS Dynamic Microphone Dynamic Range Dynamics eSATA Early Reflections Effect Effects Loop Effects Return Electret Microphone Encode/Decode Enhancer (cf. Exciter) Envelope Envelope generator E-PROM Equaliser (cf. Filter) Equivalent Input Noise Erase EuCon Eurorack Event Exciter (cf. Enhancer) Expander Expander Module Fader Ferric FET FET-Compressor Fidelity Figure of Eight File Filter (cf. Equaliser) Filter Frequency FireWire Flanging Flash Drive (see 'solid-state drive') Floppy Disk Flutter Flutter Echoes Foldback Formant Format Fragmentation (cf. defragment) Frequency Frequency Response FSK Fukada Tree Fundamental FX Gain Gain Staging Galvanic Isolation Gate (CV) Gate General MIDI Glitch GM Reset Gooseneck Swan Neck Graphic Equaliser Ground Ground Loop / Ground Loop Hum Group GS GUI Hard Disk Drive (cf. Solid-state Drive) Harmonic Harmonic Distortion Head Headroom Hertz (Hz) High-Pass Filter (HPF) High-range (highs) High Resolution Hiss Hub Hum Hysteresis Hz IC IEM Impedance Impulse Response Inductor Initialise Insert Points Input Impedance Insulator Instrument Level Interface Intermittent Intermodulation Distortion I/O IPS IRQ Isolation Room spill Isolator (also decoupler) Isopropyl Alcohol Jackfield Jack Plug Jargon Jog Wheel k K-Metering K-Weighting Latency (cf. 
Delay) Lay Length LED LCD LFO LSB Lightpipe Limiter Linear Line-level LKFS LUFS Load Local On/Off Logic Loom Loop Low Frequency Oscillator (LFO) Low-Pass Filter (LPF) Loudspeaker (also Monitor and Speaker) Loudness Loudness-Normalisation Loudness Wars Low-range (low, lows) LUFS L U F S LKFS K m M MADI Magnetic Shielding Master Mastering Matrix Maximum SPL MB Machine Head MDM Memory Menu Metering Mic Level Microphone Microprocessor Mid-range (mid, mids) MIDI MIDI Analyser MIDI Bank Change MIDI Controller MIDI Control Change MIDI File MIDI Implementation Chart MIDI In MIDI Merge MIDI Module MIDI Multitimbral Module MIDI Mode MIDI Note Number MIDI Note On MIDI Note Off MIDI Out MIDI Port MIDI Program Change MIDI Splitter MIDI Sync MIDI Thru MIDI Thru Box Mineral Wool Mirror Points Mixer Modal Distribution Modelling Modes (room) Room Modes Monitor (also Loudspeaker ) Monitor Controller Mono Monophonic Mono-synth poly-synth paraphonic Motherboard Moving Coil Microphone M-S (Mid-Side) MTC Mult Multi-sample Multi-timbral Multitrack Mutual Angle Near-coincident Near Field Noise Reduction Noise-shaping Non-registered parameter Number Non-linear Recording Normalise NOS Nyquist Theorum Nut Octave Off-line Off/On-axis Ohm Omnidirectional Open Circuit Open Reel Open Sound Control Operating System Optimisation (of computer) Opto-electronic Device ORTF OSC oscillator Open Sound Control Oscillator Out-of-Phase Polarity Output Impedance Output Sensitivity Overdubbing Overdrive Overload Overtone Pad Pan-pot Parallel Parameter Parameteric EQ Paraphonic Partials Passive Passive Loudspeaker or Monitor Patch Patch Bay Patch Cord PCI Card PCM Peak Peak-Normalisation PFL Phase Polarity Phaser Phantom Power Phono plug (RCA-phono) Pickup Pink Noise Pitch Pitch-bend Pitch-shifter Plug-in Plug-in Power Polar Pattern Polarity Polyphony Poly-mode Poly-Synth Pop Shield Port Portamento Post-production Potentiometer (Pot) Power Amplifier Power supply Powered Loudspeaker or Monitor Post-fade PPM 
PPQN PQ Coding Pre-amp Pre-emphasis Pre-fade Preset Pressure Print-through Processor Program Change Project Studio Proximity Effect Pulse Wave Pulse-width Modulation Punch-in Punch-out PWM Compression PZM Q Quantisation Quantiser Rack Mount RAM R-DAT Real-time Red Book CD Reflection Release Resistance Resonance Reverb Reverberation Time RF RF Interference RF Capacitor Microphone Ribbon Microphone Rider Ring Modulator RMS Roll-off ROM Room Modes Rotary Encoder Safety Copy Sample Sample rate Sample and Hold SATA Sawtooth Wave Scrape Flutter Scrubbing SCSI Session Tape Sequencer Shockmount Short-Circuit Sibilance Side-chain Signal Signal Chain Signal-to-noise Ratio Sine Wave Single-ended Noise Reduction Slate Slave SMPS SMPTE S/MUX Snake Solid-state Drive (cf. Hard Disk Drive) Sound Card Sound On Sound Soundproofing Spaced Array S/PDIF Speaker (also Loudspeaker and Monitor) Spill SPL SPP Square Wave SRA SSD Standard Midi File Standing Waves Stage Box Snake Stems Step Time Stereo Stereo Recording Angle Sticky Shed Syndrome Stripe Sub-bass Subcode Subgroup Subtractive Synthesis Subwoofer Surge Surround Sustain Swan Neck Gooseneck Sweet Spot Switching Power Supply Sync Synthesis Synthesiser SysEx Talkback Tape Head Tempo Test Tone THD Thru Thunderbolt Timbre Timbral TOSlink Track Tracking Transformer Transients Transmission-Line Transparency Tremolo Transducer Transpose Triangle Wave TRS True Peak Meter Truss Rod TT Plug Tube Tweeter Unbalanced Unison Unity Gain USB USB-C Valve Vari-Mu Compressor VCA VCA Compressor VCA. VCA Group VDU Velocity Vocoder Vocal Booth Isolation Room Voice Vibrato VU Meter Wah Pedal Watt Warmth Waveform Way Wet White Noise Word Clock Wow & Flutter Wrap Write XG XLR X-Y Y-Lead Zenith Zero Crossing Point Zipper Noise |
12
linvaux 2020-06-30 22:51:41 +08:00 1
//*[@id="node-4917231"]/div/div[2]/ul/li/strong/text()
I've written the XPath for you. |
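A rough sketch of applying that XPath from Python. The standard library's `xml.etree.ElementTree` supports only a subset of XPath (no `/text()` step), so the path below stops at `strong` and reads `.text` instead. The nesting in `SAMPLE_XML` is an assumption made up to fit the path, not a copy of the real page. Real-world HTML is rarely well-formed XML, so for the actual site a full XPath engine such as lxml would be the more realistic choice.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed stand-in whose nesting is assumed to match the XPath.
SAMPLE_XML = """
<body>
  <article id="node-4917231">
    <div>
      <div>intro</div>
      <div>
        <ul>
          <li><strong>BNC</strong> (definition text)</li>
          <li><strong>Boom</strong> (definition text)</li>
        </ul>
      </div>
    </div>
  </article>
</body>
"""

# Same path as in the reply, minus the trailing /text() step.
STRONG_PATH = ".//*[@id='node-4917231']/div/div[2]/ul/li/strong"


def strong_texts(xml_source):
    """Return the text of each <strong> element selected by STRONG_PATH."""
    root = ET.fromstring(xml_source)
    return [element.text for element in root.findall(STRONG_PATH)]


if __name__ == "__main__":
    print(strong_texts(SAMPLE_XML))
```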
13
cx524541577 OP @limuyan44 Thank you so much, I really appreciate it.
|
14
BBCCBB 2020-06-30 22:57:00 +08:00 1
Libraries like pyquery, bs4, and cssselect can all do this.
|
15
Jirajine 2020-06-30 23:01:30 +08:00 1
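Press F12 to open the console and paste this straight in:
```javascript
let content = "";
// Walk the DOM recursively and collect the text of every <strong> element.
(function get_content(node) {
  for (const child of node.children) {
    if (child.nodeName == "STRONG") {
      content += child.innerText + "\n";
    }
    get_content(child);
  }
})(document.body);
console.log(content);
```
|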
16
shanghj 2020-06-30 23:14:20 +08:00 1
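bs4, re, and XPath can all do what you need.
XPath extensions: Chrome: XPath Helper (I haven't used it in a long time, so I don't know how well it works now); Firefox: ChroPath (set it up following the instructions on its page; it works well). |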
17
ydpro 2020-07-01 09:46:20 +08:00 1
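You can use the selenium library, plus a downloaded Chrome driver (its version has to match your installed Google Chrome).
Save this as strong.py:

from selenium import webdriver
from selenium.webdriver.common.by import By
import os

os.chdir('E:\\strong')
wb = webdriver.Chrome()
wb.get('https://www.soundonsound.com/sound-advice/glossary-technical-terms?amp')
# find_elements(By.TAG_NAME, ...) is used here because find_elements_by_tag_name
# was removed in later Selenium 4 releases.
elements = wb.find_elements(By.TAG_NAME, 'strong')
for element in elements:
    print(element.text)
wb.quit()  # close the browser when done

Then run this in a terminal: python strong.py > output.txt  # redirects the output to a txt file |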
18
cx524541577 OP @ydpro Thank you so much for the reply 🙏
|
19
ydpro 2020-07-01 16:03:15 +08:00 1
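On second thought, it's better to write each element's text straight to a txt file with a for loop inside a with block; then there's no need to redirect the output.

from selenium import webdriver
from selenium.webdriver.common.by import By
import os

os.chdir('E:\\strong')
wb = webdriver.Chrome()
wb.get('URL')  # put the page address here
filename = 'a.txt'
# encoding='utf-8' keeps non-ASCII characters from failing on Windows defaults
with open(filename, 'w', encoding='utf-8') as web_strong:
    elements = wb.find_elements(By.TAG_NAME, 'strong')
    for element in elements:
        web_strong.write(element.text + '\n')
wb.quit()  # close the browser when done
|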