Installing Django and Pulling URLs and Titles from Safari Tabs - Parts 1 and 2
[Time: 00:00:00] TextExpander ISO 8601 Snippet Fix
Started off the stream with a quick fix for an old post. I wrote this TextExpander snippet back in 2015. It outputs an ISO 8601 timestamp (e.g. 2020-10-04T16:09:34-0400). It's been super handy over the years. The one in my TextExpander works fine, but the one in the post had a bug. I didn't know about it until I got this tweet from @pulamusic.
I took a look at the source code for the page and it looked fine (though I realized when reviewing the stream that it wasn't). So, I sshed into my server and fixed it that way. I had to do that because the link was actually to the old instance of my site. It's still live at http://alanwsmith.com, as compared to the new production version of my site, which is https://www.alanwsmith.com. (It's on my list to get everything set up so the non-secure, non-www version redirects. Just gotta get to it.)
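For anyone who wants the same timestamp format outside of TextExpander, here's a minimal Python sketch. It is not the snippet from the post (that one isn't shown here), and the function name `iso_8601_stamp` is made up for illustration.

```python
#!/usr/bin/env python3
from datetime import datetime, timedelta, timezone


def iso_8601_stamp(moment=None):
    """Format a datetime like 2020-10-04T16:09:34-0400."""
    if moment is None:
        # astimezone() with no arguments attaches the local time zone
        moment = datetime.now().astimezone()
    return moment.strftime("%Y-%m-%dT%H:%M:%S%z")


if __name__ == "__main__":
    # A fixed moment at UTC-4 so the output is predictable
    eastern = timezone(timedelta(hours=-4))
    print(iso_8601_stamp(datetime(2020, 10, 4, 16, 9, 34, tzinfo=eastern)))
    # prints 2020-10-04T16:09:34-0400
```

Calling `iso_8601_stamp()` with no argument uses the current local time instead of the fixed example.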
[Time: 00:19:00] Installing Drupal, I mean Django
I've decided to use Django to replace my local launchpad website that's just a bunch of PHP files. (Django, not Drupal. I mix up the two names every time.) Someone asked why I was going with Django instead of sticking with PHP. My thinking:
- When I write code these days, it's mostly Python, and that's what Django uses
- I want to use a framework instead of just writing a bunch of code myself. (I've made my own frameworks. I'm happy to not have to do that anymore.)
- I'm not religious about any framework or language stuff so it just works for me
Once I figured out that I was after Django instead of Drupal, I did the quick install then spent about an hour going through the tutorial. Half the time was me getting frustrated with it. I'd looked at Django a few years back and remember the same frustrations. It's not broken. The code works, but there is so much room for improvement. I added "Make a Better Django Tutorial" to my list.
If it wasn't so frustrating, I would have spent more time on it. But, I was beat, so I bailed. I'll get back to it on another stream.
[Time: 01:26:00] Getting Browser Tab URLs
This is a new one that occurred to me when writing up earlier stream notes. I spent a lot of time going back and forth between my text editor and my browser copying URLs and then typing in the titles and notes for the various links I used. My goal was to create a script to automate that process as much as possible.
What's awesome is that most of the work was already done for me. I found this AppleScript that grabs titles and URLs from all the tabs in all the windows of Safari and copies them to the clipboard. One quick edit to put the output in Markdown format and I could have stopped there, but I wanted a little more.
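The Markdown step is small enough to sketch. The real edit was made inside the AppleScript; this is a Python stand-in for the same idea, and both `to_markdown_links` and the sample tab data are hypothetical.

```python
#!/usr/bin/env python3


def to_markdown_links(tabs):
    """Turn (title, url) pairs into a Markdown bullet list."""
    lines = []
    for title, url in tabs:
        # Escape square brackets so they don't break the link text
        safe_title = title.strip().replace("[", "\\[").replace("]", "\\]")
        lines.append(f"- [{safe_title}]({url.strip()})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data standing in for real Safari tabs
    sample_tabs = [
        ("Example Domain", "https://example.com/"),
        ("Another Example", "https://example.org/"),
    ]
    print(to_markdown_links(sample_tabs))
```

Each tab becomes one `- [title](url)` line, which can be pasted straight into a notes file.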
The first thing I was after was a way to fire off the AppleScript from a PHP page on my local launchpad tools site. I'd done some osascript calls to fire AppleScript over the past week, so I was optimistic I could get it to work.
I spent a decent amount of time on it and kept seeing behavior that looked like it should have worked, but didn't. I have some more ideas to try, but given that I'm moving to Django it wasn't worth spending more time on it. Instead, I moved over to using a plain old Python script. I went that route instead of just using the working AppleScript because I wanted to capture the meta tag descriptions of the pages as well.
It might be possible to pull down a web page and parse it with AppleScript, but I have no desire to try. Hence, Python. I started using Selenium to do the parsing but ran into some aggravating issues with getting the title of the page (which I didn't need; it was just what I was using as a test).
Seems that for every element on a page, Selenium uses:
    element = driver.find_element_by_tag_name("title")
    print(element.text)
Every element except for the title, that is. You have to get the title with driver.title. Here's an example:
    #!/usr/bin/env python3
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options


    def get_details(url):
        options = Options()
        # Run Firefox without opening a window
        options.add_argument("-headless")
        driver = webdriver.Firefox(options=options)
        try:
            driver.get(url)
            # The title comes from the driver, not from an element lookup
            return driver.title
        finally:
            # Always shut the browser down, even if the page load fails
            driver.quit()


    print(get_details("https://www.alanwsmith.com/"))
Once I got that figured out, I moved on to getting the description. This is around the time I ran out of steam in the first stream and picked up in the second one. In that second stream, the code got complicated enough that I moved over from just a little script to a more structured one that included tests. I made a lot of progress on that, but haven't finished it up yet. I'll do that in the next stream.
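As a rough idea of what "getting the description" involves, here's a minimal sketch. The links below suggest Beautiful Soup was the tool on stream; this version swaps in the standard library's `html.parser` so it runs without installing anything. It is not the code from the stream, and `MetaDescriptionParser` and `get_description` are hypothetical names.

```python
#!/usr/bin/env python3
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Remember the first <meta name="description"> content value."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        # Attribute names arrive lowercased; values keep their case
        if (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content")


def get_description(html_doc):
    """Return the description string, or None if the page has none."""
    parser = MetaDescriptionParser()
    parser.feed(html_doc)
    return parser.description


if __name__ == "__main__":
    # Made-up HTML standing in for a downloaded page
    sample = (
        "<html><head><title>Home</title>"
        '<meta name="description" content="Stream notes">'
        "</head></html>"
    )
    print(get_description(sample))
```

Returning None for pages without a description leaves it to the calling code to decide what to show instead.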
Random stuff from the stream.
Python snippet that calls an external command and captures the STDOUT results in a variable:
    import subprocess
    command_response = subprocess.run(['osascript', 'tab-parser.scpt'], stdout=subprocess.PIPE).stdout.decode('utf-8')
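A slightly more defensive take on the same idea, as a sketch. The helper name `run_command` and the 30 second timeout are assumptions, not something from the stream.

```python
#!/usr/bin/env python3
import subprocess
import sys


def run_command(args, timeout=30):
    """Run a command and return its STDOUT as a string."""
    completed = subprocess.run(
        args,
        capture_output=True,  # collect both STDOUT and STDERR
        encoding="utf-8",     # decode the output bytes as UTF-8
        check=True,           # raise CalledProcessError on a non-zero exit
        timeout=timeout,      # give up if the command hangs
    )
    return completed.stdout


if __name__ == "__main__":
    # The current Python interpreter stands in for the command so the
    # example runs anywhere. On a Mac the args could be
    # ["osascript", "tab-parser.scpt"] like the one-liner in the post.
    print(run_command([sys.executable, "-c", "print('hello from a child process')"]))
```

With `check=True`, a failing command raises an exception instead of quietly handing back an empty string, which makes problems easier to spot.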
My PHP calls to osascript fired and got Safari to activate and bring itself to the front window. But when I tried to run the AppleScript file, I couldn't get it to work.
    // no go
    echo(shell_exec('osascript tab-parser.scpt'));
I tried this and about a thousand variations. I expect there's a security thing involved. There's probably a way to do it, but it wasn't worth more effort for me.
I discovered that when you use Python's urllib.request.urlopen(url), you need to wrap it in a try. Otherwise, your script will explode if it hits something like a 403 error. (And since we're talking about it, .read() returns bytes, so you also need to decode the result as utf-8 to get a string.) Here's a sample:
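    #!/usr/bin/env python3
    import urllib.request


    def get_web_page(url):
        try:
            with urllib.request.urlopen(url) as response:
                # read() returns bytes, so decode them into a string
                return response.read().decode("utf-8")
        except Exception:
            # HTTP errors such as 403, network failures, and bad data all
            # land here instead of crashing the script
            return ""


    if __name__ == "__main__":
        html_doc = get_web_page("https://www.alanwsmith.com/")
        print(html_doc)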
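If it matters why a fetch failed, the error types can be separated out. This is a sketch under a few assumptions: `fetch_page` is a hypothetical name, and the browser-like User-Agent header is a guess at a common workaround (some servers answer 403 to urllib's default one), not a guaranteed fix.

```python
#!/usr/bin/env python3
import urllib.error
import urllib.request


def fetch_page(url, timeout=10):
    """Return (text, error). error is None when the fetch worked."""
    # Assumption: send a browser-like User-Agent instead of urllib's default
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace"), None
    except urllib.error.HTTPError as error:
        # The server replied, but with a status like 403 or 404
        return "", f"HTTP {error.code}"
    except OSError as error:
        # URLError, timeouts, and other network failures are OSError subclasses
        return "", str(error)


if __name__ == "__main__":
    # A data: URL keeps the demo offline; swap in a real address to try it
    text, error = fetch_page("data:text/plain;charset=utf-8,hello")
    print(text, error)
```

HTTPError has to be caught before OSError because it is the more specific of the two.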
Links From The Stream
Here are some of the links from the stream.
- AppleScript - How to execute a multi line applescript from Terminal – MacOS X Software – Forum
- Beautiful Soup 4.9.0 documentation
- Beautiful Soup Finding if a tag exists - Stack Overflow
- beautifulsoup4 · PyPI
- Built-in Types — Python 3.8.6 documentation
- Call a function from another file? - Stack Overflow
- Capture all tabs in Safari as URLs to the clipboard – theconsultant.net
- Code inspections - Help | IntelliJ IDEA
- Convert bytes to a string - Stack Overflow
- Converting to one line AppleScript - Stack Overflow
- Daring Fireball
- Data Types — Python 3.8.6 documentation
- Errors and Exceptions — Python 3.8.6 documentation
- Extract title with BeautifulSoup - Stack Overflow
- ForLoop - Python Wiki
- Get data from webpage using applescript - Stack Overflow
- Get meta tag content property with BeautifulSoup and Python - Stack Overflow
- Get page title with Selenium WebDriver using Java - Stack Overflow
- Get webpage contents with Python? - Stack Overflow
- get_attribute() element method - Selenium Python - GeeksforGeeks
- Glossary — Python 3.8.6 documentation
- How do I open a generic URL from AppleScript? - Ask Different
- How to call an external command? - Stack Overflow
- How to disable auto show hints in JetBrains IDEs (IntelliJ IDEA, PyCharm, WebStorm) on mouse over - Stack Overflow
- How to use string.replace() in python 3.x - Stack Overflow
- How to write applescript to print TextEdi… - Apple Community
- HTTP error 403 in Python 3 Web Scraping - Stack Overflow
- Is there a built-in function to print all the current properties and values of an object? - Stack Overflow
- Linux see directory tree structure using tree command - nixCraft
- Locating Elements — Selenium Python Bindings 2 documentation
- Meta tags and BeautifulSoup
- Model field reference | Django documentation | Django
- More Control Flow Tools — Python 3.8.6 documentation
- Ned Batchelder: Keep data out of your variable names
- PHP | shell_exec() vs exec() Function - GeeksforGeeks
- PHP: shell_exec - Manual
- Print to Stdout with applescript - Stack Overflow
- Pythex: a Python regular expression editor
- Python BeautifulSoup check if find returns Null object | Edureka Community
- Python Data Types
- Python: Assign split value to multiple variables - Stack Overflow
- Running shell command and capturing the output - Stack Overflow
- Scraping Data on the Web with BeautifulSoup
- Scraping metadata with beautifulsoup : learnpython
- selenium - getTitle() returning current URL instead of page title - Software Quality Assurance & Testing Stack Exchange
- Selenium How To Get Title Text? - Stack Overflow
- Set up a Git repository - Help | PyCharm
- sets — Unordered collections of unique elements — Python 2.7.18 documentation
- Settings | Django documentation | Django
- Sorting a set of values - Stack Overflow
- Sorting HOW TO — Python 3.3.7 documentation
- Split and Count a Python String - Stack Overflow
- Test if children tag exists in beautifulsoup - Stack Overflow
- The try statement
- Time Zone Abbreviations - Time Zones in North America
- Time Zone Abbreviations - Worldwide List
- Time Zones in New York, United States
- urllib.request — Extensible library for opening URLs — Python 3.8.6 documentation
- webdriver - Question about the Selenium getTitle() Method - Software Quality Assurance & Testing Stack Exchange
- WebDriver API — Selenium Python Bindings 2 documentation
- What is close() and quit() commands in Selenium Webdriver? | Zyxware Technologies
- Writing your first Django app, part 3 | Django documentation | Django