Nice article on ArsTechnica with some introductory Automator scripting:
How to build Mac OS X services with Automator and shell scripting.
In the last part we built a useful workflow that would open a given number of unread articles from your Instapaper feed. But we stopped short of the goal: converting the text of the articles to speech files.
If you look into the library of Automator actions there is one with the promising name “Get Text from Webpage.” However this will extract all the text, usually including all the menus, ads and all the other detritus that clutters webpages these days. The latest version of Safari (( Safari 5, as I write this )) has a functionality called “Reader,” which removes all this clutter and allows the user to focus on just the text. Unfortunately, the “Reader” functionality in Safari is not scriptable.
But before Safari had “Reader” there was the Readability javascriptlet from Arc90 Lab which does very much the same thing. Since Safari’s AppleScript dictionary allows us to execute arbitrary JavaScript against a webpage, we can use that to extract the relevant text from the article. That saves us from having to recreate the logic of the Readability scriptlet in AppleScript.
Do the following with the workflow we built in Part 1:
on run {input, parameters}
    -- uses the 'Readability' javascript from
    -- http://lab.arc90.com/experiments/readability/
    set readabilityScript to "javascript:(function(){readConvertLinksToFootnotes=false;readStyle='style-newspaper';readSize='size-medium';readMargin='margin-medium';_readability_script=document.createElement('script');_readability_script.type='text/javascript';_readability_script.src='http://lab.arc90.com/experiments/readability/js/readability.js?x='+(Math.random());document.documentElement.appendChild(_readability_script);_readability_css=document.createElement('link');_readability_css.rel='stylesheet';_readability_css.href='http://lab.arc90.com/experiments/readability/css/readability.css';_readability_css.type='text/css';_readability_css.media='all';document.documentElement.appendChild(_readability_css);_readability_print_css=document.createElement('link');_readability_print_css.rel='stylesheet';_readability_print_css.href='http://lab.arc90.com/experiments/readability/css/readability-print.css';_readability_print_css.media='print';_readability_print_css.type='text/css';document.getElementsByTagName('head')[0].appendChild(_readability_print_css);})();"
    set output to {}
    tell application "Safari"
        repeat with x in input
            set theURL to contents of x
            make new document with properties {URL:theURL}
            delay 0.5
            repeat until ((do JavaScript "document.readyState;" in document of window 1) is equal to "complete")
                delay 0.5
            end repeat
            set d to document of window 1
            do JavaScript readabilityScript in d
            delay 3
            repeat until ((do JavaScript "document.readyState;" in d) is equal to "complete")
                delay 1
            end repeat
            set thetext to text of d
            -- remove first three and last four paragraphs since these are Readability links
            set AppleScript's text item delimiters to return
            set thetext to (paragraphs 4 through -5 of thetext) as text
            close d
            set output to output & {thetext}
        end repeat
    end tell
    return output
end run
Let’s slowly go through this code:
First we set up an empty list output to store the results.
Then we loop through all items in the input variable. In this case the items are the URLs of the Instapaper posts.
    set theURL to contents of x
This gets the URL out of the current item of the repeat loop.
    make new document with properties {URL:theURL}
    delay 0.5
We tell Safari to open a new document with the given URL and pause for a while to let Safari start loading.
    repeat until ((do JavaScript "document.readyState;" in document of window 1) is equal to "complete")
        delay 0.5
    end repeat
    set d to document of window 1
We have to wait until the page is completely loaded before we can apply the Readability script against the page. Unfortunately Safari does not expose the state of the page (loading or complete) to AppleScript. This is however exposed to the JavaScript DOM within the page, and we can access DOM information from AppleScript with the do JavaScript event. So we poll the document.readyState attribute in JavaScript until it reports complete. Then we remember a reference to this document in a variable. ((Safari has a bug where an AppleScript reference to a document will change while it is loading, resulting in broken references. All this is a clumsy but effective workaround.))
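One weakness of the polling loop is that it never gives up if a page hangs. As a rough sketch of the same poll-until-complete idea with an upper limit on attempts, here it is as a shell function. The name wait_until and its argument order are my own invention for illustration, and the osascript line in the comment assumes Safari allows JavaScript from Apple Events.

```shell
#!/bin/bash
# wait_until EXPECTED MAX_TRIES DELAY COMMAND [ARGS...]
# (hypothetical helper, not part of the original workflow)
# Runs COMMAND repeatedly until its output equals EXPECTED.
# Returns 0 on success, 1 once MAX_TRIES attempts have been used up.
wait_until() {
    local expected="$1" max_tries="$2" delay="$3"
    shift 3
    local try=0
    while [ "$try" -lt "$max_tries" ]; do
        if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
            return 0
        fi
        try=$((try + 1))
        sleep "$delay"
    done
    return 1
}

# Hypothetical usage on a Mac, polling the front Safari document:
# wait_until complete 60 0.5 osascript -e \
#   'tell application "Safari" to do JavaScript "document.readyState" in document 1'
```

The same cap could be added to the AppleScript loop with a counter variable; the shell version is only meant to show the pattern.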
    do JavaScript readabilityScript in d
    delay 3
    repeat until ((do JavaScript "document.readyState;" in d) is equal to "complete")
        delay 1
    end repeat
We use the same DOM trick to wait until Safari is done.
The text property of the document now contains the cleaned-up text of the article. We can extract that, remove some extra lines that Readability inserts, close the Safari window and append the text as its own element to the output list.
    set thetext to text of d
    -- remove first three and last four paragraphs since these are Readability links
    set AppleScript's text item delimiters to return
    set thetext to (paragraphs 4 through -5 of thetext) as text
    close d
    set output to output & {thetext}
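If you would rather do the trimming in a "Run Shell Script" action, the "paragraphs 4 through -5" step can be approximated as below. This is only a sketch: trim_lines is a name I made up, and it assumes the text arrives on stdin with one paragraph per line.

```shell
#!/bin/bash
# trim_lines TOP BOTTOM (hypothetical helper)
# Reads text on stdin and prints it without the first TOP and the last BOTTOM lines.
trim_lines() {
    awk -v top="$1" -v bottom="$2" '
        NR > top { buf[NR] = $0 }                 # remember every line after the first TOP
        END { for (i = top + 1; i <= NR - bottom; i++) print buf[i] }
    '
}

# Equivalent of "paragraphs 4 through -5":
# trim_lines 3 4 < article.txt
```

Unlike the AppleScript version, this prints nothing instead of raising an error when the text has fewer lines than are being removed.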
This would be a good time to save the workflow, and do a test run. You can show the results of the workflow in Automator to see if the text is extracted properly. Readability is not perfect and does not work on all pages, but the success rate is quite high.
The remaining work of converting the text into audio is very straightforward. Add the following workflow actions:
And then you are done. You can also download the complete Workflow.
Sync’ing iTunes Libraries | Krypted.com.
So I decided to offload most of my media (photos, movies, etc) off my laptop and onto my Mac Mini server. I also decided that one thing I’d like to live on both is iTunes.
Nice writeup on how to keep a folder in sync with a folder on a server.
I saw this in my Twitter stream the other day:
You know what I want? A text-to-speech plugin for @instapaper so, while commuting to/from work, I can listen to the stuff I find at work.
That shouldn’t be too hard, should it?
First we have to get the unread articles from Instapaper. If you go to Instapaper, log in, and go to your unread articles, you can see the RSS button in the URL field in Safari. To get to the RSS feed in Automator, do the following:
If you are anything like me this workflow will open quite a large number of pages. I think Instapaper limits the RSS feed to 25. That’s still a lot of new Safari tabs/windows you are opening there. We want to add an action that restricts the number of items passed through it. Surprisingly there is none in the default actions, but this is fairly easy to add. Insert a new “Run AppleScript” action before the “New Safari Documents” action and replace the default code with the following:
on run {input, parameters}
    set maxNum to 3
    -- filters all but the first maxNum items from the articles, change as appropriate
    -- enter '-1' or remove this action entirely to get all urls
    if (count of input) > maxNum then
        set output to items 1 through maxNum of input
    else
        set output to input
    end if
    return output
end run
This will only pass through the first maxNum items passed into it, regardless of type. You can change maxNum to fit your taste and/or needs. You can also set maxNum to -1 if you want to pass all items without removing the AppleScript action.
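The same filter could also live in a “Run Shell Script” action instead. The following is a minimal sketch, assuming the action is set to pass input “to stdin” so that the items arrive one per line; limit_items is a made-up name.

```shell
#!/bin/bash
# limit_items MAX (hypothetical helper)
# Passes through at most MAX lines from stdin; a negative MAX passes everything.
limit_items() {
    local max="$1"
    if [ "$max" -lt 0 ]; then
        cat                 # negative limit: no filtering at all
    else
        head -n "$max"      # keep only the first MAX items
    fi
}

# In the action itself this would simply be:
# limit_items 3
```

I stuck with AppleScript in the workflow because it passes the items through unchanged regardless of type, while the shell version only sees them as lines of text.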
Save again and try running it. The next step will be to filter the actual text out of the web page which will be a little tougher and the main topic of Part 2.
ssh Tricks
I found this website with a bunch of ssh tricks. Some highlights:
Compare a Remote File with a Local File
ssh user@host cat /path/to/remotefile | diff /path/to/localfile -
Useful for checking if there are differences between local and remote files.
opendiff ((Part of the Developer Tools installed with Xcode)) and bbdiff ((One of the tools installed by BBEdit)) do not use stdin for their input, but you can work around that by copying the file to /tmp first:
scp user@host:/path/to/remotefile /tmp/remotefile && opendiff /path/to/localfile /tmp/remotefile
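A fixed name like /tmp/remotefile gets overwritten on every run, so here is a slightly more careful sketch that lets mktemp pick a unique name. The function name fetch_to_temp is hypothetical, and the ssh invocation in the comment is just the one from above.

```shell
#!/bin/bash
# fetch_to_temp COMMAND [ARGS...] (hypothetical helper)
# Stores the output of COMMAND in a fresh temporary file and prints that file's path.
# Returns 1 (and leaves no file behind) if COMMAND fails.
fetch_to_temp() {
    local tmp_file
    tmp_file="$(mktemp "${TMPDIR:-/tmp}/remotefile.XXXXXX")" || return 1
    if "$@" > "$tmp_file"; then
        printf '%s\n' "$tmp_file"
    else
        rm -f "$tmp_file"
        return 1
    fi
}

# Possible usage:
# remote_copy="$(fetch_to_temp ssh user@host cat /path/to/remotefile)" &&
#     opendiff /path/to/localfile "$remote_copy"
```

The temporary file is deliberately not deleted afterwards, since a GUI diff tool may still be reading it after the command returns.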
SSH Connection through host in the middle
ssh -t reachable_host ssh unreachable_host
Unreachable_host is unavailable from the local network, but it’s available from reachable_host’s network. This command creates a connection to unreachable_host through a “hidden” connection to reachable_host.
The -t option forces ssh to allocate a pseudo-terminal on the intermediate host, which interactive programs such as the inner ssh need. The same trick is used later in the article where you directly attach to a remote screen session:
ssh -t remote_host screen -r
Though I prefer using screen -DR. Read the man page for details.
The next one however didn’t do anything for me; I suspect there is a piece missing in the command somewhere:
Remove a line in a text File
sed -i 8d ~/.ssh/known_hosts
However there is a dedicated tool for this: use
ssh-keygen -R host
instead. I re-image some machines over and over again and then run into the ssh host key errors. This is very useful.
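As for the sed command that did nothing for me: my best guess, and it is only a guess, is that the sed that ships with Mac OS X expects a backup-file extension right after -i, so the 8d gets taken as the extension rather than as the command. Attaching the extension directly to -i should work with both the BSD and the GNU flavor of sed. A small sketch (delete_line is a made-up name):

```shell
#!/bin/bash
# delete_line LINE_NUMBER FILE (hypothetical helper)
# Deletes one line from FILE in place and keeps the original as FILE.bak.
delete_line() {
    sed -i.bak "${1}d" "$2"
}

# Possible usage, removing line 8 from the known hosts file:
# delete_line 8 ~/.ssh/known_hosts
```

I would still reach for ssh-keygen -R first, since it finds the right line for you.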
It’s Thanksgiving here in the US. To keep you happy with minimal effort on my side I’ll give you a whole bunch of services to explore, without me (or you) having to write any of them.
Open System Preferences, select the Keyboard preference pane, select the Keyboard Shortcuts Pane and then from the list on the left select “Services.”
There you will find a long list of pre-installed services many of which are disabled by default. Go through the list and enable those that sound promising. “Get Result of AppleScript” and “Add to iTunes as Spoken Track” are two of my favorites.
Any services you have built yourself will also appear in this list. You can disable or re-enable them to keep your context menu trim.
This pane is also where you assign or change keyboard shortcuts. So if there is a service that you use frequently you can further optimize your workflow with a keystroke.
Note: update for macOS Ventura
So you’re writing this email explaining to a customer or colleague how to do some really cool thing (say, hide a file in the Finder) in Terminal. The command for that is chflags, but of course you can’t remember the exact syntax. So you open Terminal and write man chflags and find the correct options.[1]
However, reading longer man pages (try ssh or bash) in the Terminal can be kind of painful. I’m sure some of you have encountered this command before:
man -t chflags | open -f -a "Preview"
which uses the -t flag to pass the output to groff and generate a postscript file which we then pipe into the Preview app, using open’s -f option to pipe the stdin into a file to open in a GUI app. Preview will then convert the postscript to PDF and display the result.
I think this started to work in Tiger and you should immediately go and add this command to your shell’s profile.[2] Which is nice, but you still have to make the roundtrip to the Terminal.[3]
Enter Snow Leopard Automator Services. Open Automator. Create a new service. Leave the settings to work on ‘text’ in ‘any application’. Search for the ‘Run Shell Script’ action and double click to add to the workflow. Leave the Shell at ‘/bin/bash’ but set the ‘Pass Input’ option to ‘as arguments.’
Replace the default code with
man -t "$1" | open -f -a /Applications/Preview.app
Save the Service and give it a nice name, such as “Open Man Page.”
Then in any application[4] you can ctrl/right/double-finger click on a word and “Open Man Page” will be an option in the menu.[5] You can even go to System Preferences -> Keyboard -> Keyboard Shortcuts and add a keyboard shortcut to the command.[6] If any other command in the man page piques your curiosity, just ctrl/right/double-finger click the word in Preview and select “Open Man Page” again.
Another little-known but quite useful trick is that you can create hyperlinks to man pages with the x-man-page://command URL.[7] This will open the man page in man in a new Terminal window. This is especially useful in IM sessions.
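To save some typing when pasting such links, a tiny helper can build the URL for you. This is just a sketch: man_page_url is a name I made up, and the section form (x-man-page://section/command) is an assumption about what Terminal accepts.

```shell
#!/bin/bash
# man_page_url COMMAND [SECTION] (hypothetical helper)
# Prints an x-man-page URL for COMMAND, optionally limited to a manual SECTION.
man_page_url() {
    if [ -n "$2" ]; then
        printf 'x-man-page://%s/%s\n' "$2" "$1"
    else
        printf 'x-man-page://%s\n' "$1"
    fi
}

# On a Mac you could open the result directly:
# open "$(man_page_url chflags)"
```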
[1] chflags [no]hidden /path/to/file
[2] function preman() { man -t "$@" | open -f -a "Preview" ;}