Web Scraping

Regular Expression Pattern    Matches
.                             Any character (except a newline, by default)
\w                            Word character
\W                            NOT a word character
\d                            Digit
\D                            NOT a digit
\s                            Whitespace
\S                            NOT whitespace
[abc]                         Any of a, b, or c
[^abc]                        Any character other than a, b, or c
(abc)                         Capture group matching “abc”
+                             1 or more instances
*                             0 or more instances
?                             0 or 1 instance

HTML Tag                      Explanation
<!DOCTYPE>                    Defines the document type
<html>                        Defines an HTML document
<head>                        Main information (metadata) about the document
<title>                       Title for the document
<body>                        Document body
<h1> to <h6>                  Headings
<p>                           Paragraph
<br>                          Line break
<!-- comment here -->         Comment
<img>                         Image
<a>                           Hyperlink
<ul>                          Unordered list
<ol>                          Ordered list
<li>                          List item
<style>                       Style information for a document
<div>                         Section in a document (block-level)
<span>                        Section in a document (inline)
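
The patterns and tags above can be combined as in the following minimal sketch, which uses Python's standard `re` module. The HTML snippet, the URL, and the variable names (`html`, `title`, `items`, `links`) are made up for illustration and are not part of the cheat sheet.

```python
import re

# Hypothetical HTML snippet, used only for illustration.
html = """<html>
<head><title>Sample Page</title></head>
<body>
<h1>Prices</h1>
<ul>
<li>Apple: 3 USD</li>
<li>Pear: 12 USD</li>
</ul>
<a href="https://example.com/more">More</a>
</body>
</html>"""

# [^<]+ matches one or more characters that are not "<";
# the parentheses capture the text between the <title> tags.
title = re.search(r"<title>([^<]+)</title>", html).group(1)

# \w+ matches one or more word characters, \s* zero or more whitespace
# characters, and \d+ one or more digits; each (...) is a capture group.
items = re.findall(r"<li>(\w+):\s*(\d+)\s*USD</li>", html)

# [^"]+ matches one or more characters that are not a double quote.
links = re.findall(r'<a\s+href="([^"]+)"', html)

print(title)
print(items)
print(links)
```

Regular expressions tend to work for small, predictable snippets like this one; for real pages with nested or irregular markup, a dedicated HTML parser is usually the more robust choice.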