Nice article on ArsTechnica with some introductory Automator scripting:
How to build Mac OS X services with Automator and shell scripting.
In the last part we built a useful workflow that opens a given number of unread articles from your Instapaper feed. But we stopped short of the goal: converting the text of the articles to speech files.
If you look into the library of Automator actions there is one with the promising name “Get Text from Webpage.” However, this will extract all the text, usually including the menus, ads and all the other detritus that clutters webpages these days. The latest version of Safari (( Safari 5, as I write this )) has a feature called “Reader,” which removes all this clutter and allows the user to focus on just the text. Unfortunately, the “Reader” feature in Safari is not scriptable.
But before Safari had “Reader” there was the Readability javascriptlet from Arc90 Lab, which does very much the same thing. Since Safari’s AppleScript dictionary allows us to execute arbitrary JavaScript against a webpage, we can use that to extract the relevant text from the article. That saves us from having to recreate the logic of the Readability scriptlet in AppleScript.
Do the following with the workflow we built in Part 1:
on run {input, parameters}
    -- uses the 'Readability' javascript from
    -- http://lab.arc90.com/experiments/readability/
    set readabilityScript to "javascript:(function(){readConvertLinksToFootnotes=false;readStyle='style-newspaper';readSize='size-medium';readMargin='margin-medium';_readability_script=document.createElement('script');_readability_script.type='text/javascript';_readability_script.src='http://lab.arc90.com/experiments/readability/js/readability.js?x='+(Math.random());document.documentElement.appendChild(_readability_script);_readability_css=document.createElement('link');_readability_css.rel='stylesheet';_readability_css.href='http://lab.arc90.com/experiments/readability/css/readability.css';_readability_css.type='text/css';_readability_css.media='all';document.documentElement.appendChild(_readability_css);_readability_print_css=document.createElement('link');_readability_print_css.rel='stylesheet';_readability_print_css.href='http://lab.arc90.com/experiments/readability/css/readability-print.css';_readability_print_css.media='print';_readability_print_css.type='text/css';document.getElementsByTagName('head')[0].appendChild(_readability_print_css);})();"

    set output to {}
    tell application "Safari"
        repeat with x in input
            set theURL to contents of x
            make new document with properties {URL:theURL}
            delay 0.5
            repeat until ( (do JavaScript "document.readyState;" in document of window 1) is equal to "complete")
                delay 0.5
            end repeat
            set d to document of window 1
            do JavaScript readabilityScript in d
            delay 3
            repeat until ( (do JavaScript "document.readyState;" in d) is equal to "complete")
                delay 1
            end repeat
            set thetext to text of d
            -- remove first three and last four paragraphs since these are Readability links
            set AppleScript's text item delimiters to return
            set thetext to (paragraphs 4 through -5 of thetext) as text
            close d
            set output to output & {thetext}
        end repeat
    end tell
    return output
end run
Let’s slowly go through this code:
First we create an empty list named output to store the results.
Then we loop over the items in the input variable. In this case the items are the URLs of the Instapaper posts.
    set theURL to contents of x
This gets the actual URL out of the reference that the repeat loop hands us.
    make new document with properties {URL:theURL}
    delay 0.5
We tell Safari to open a new document with the given URL and pause for a while to let Safari start loading.
    repeat until ( (do JavaScript "document.readyState;" in document of window 1) is equal to "complete")
        delay 0.5
    end repeat
    set d to document of window 1
We have to wait until the page is completely loaded before we can apply the Readability script against the page. Unfortunately, Safari does not expose the state of the page (loading or complete) to AppleScript. It is, however, exposed to the JavaScript DOM within the page, and we can access DOM information from AppleScript with the do JavaScript event. So we poll the document.readyState attribute in JavaScript until it reports complete. Then we remember a reference to this document in a variable. ((Safari has a bug where an AppleScript reference to a document will change while it is loading, resulting in broken references. All this is a clumsy but effective workaround.))
    do JavaScript readabilityScript in d
    delay 3
    repeat until ( (do JavaScript "document.readyState;" in d) is equal to "complete")
        delay 1
    end repeat
We use the same DOM trick to wait until Safari is done.
Now the text property of the document contains the cleaned-up text of the article. We can extract that, remove some extra lines that Readability inserts, close the Safari window and append the text as its own element to the output list.
    set thetext to text of d
    -- remove first three and last four paragraphs since these are Readability links
    set AppleScript's text item delimiters to return
    set thetext to (paragraphs 4 through -5 of thetext) as text
    close d
    set output to output & {thetext}
This would be a good time to save the workflow, and do a test run. You can show the results of the workflow in Automator to see if the text is extracted properly. Readability is not perfect and does not work on all pages, but the success rate is quite high.
The remaining work of converting the text into audio is very straightforward. Add the following workflow actions:
And then you are done. You can also download the complete Workflow.
Screen Sharing is a really useful tool in Mac OS X. Most people use it locally and select the Computer from the Sharing area in the Finder sidebar. You can also connect Screen Sharing to a remote host. In Finder select “Connect to Server” from the “Go” menu and enter
vnc://host.example.com/
which will connect Screen Sharing to the address. ((It will use VNC on TCP port 5900 in case you have connection issues.))
You could add the vnc URI to the favorites in the “Connect to Server” dialog, but there is a better way: Screen Sharing remembers the last connections in ~/Library/Application Support/Screen Sharing/. There you will find the hosts you have connected to as .vncloc files. Find the host(s) you use most frequently and copy them to the Desktop or your Documents folder. ((Anywhere Spotlight will index.)) Then rename them to just the hostname or another descriptor. You can now double-click to initiate the Screen Sharing connection. But even better: you can invoke Spotlight, start typing the hostname and the vncloc file should be right there. No matter what you are doing, the remote session is just a few keystrokes away.
However, if you prefer to use Apple Remote Desktop over Screen Sharing, this will not work, because ARD does not open vncloc files. ARD is scriptable, though, so we can build a workaround. Even better, ARD supports Automator, so we don’t even need to write code.
Create more applets for each host you frequently use. If you select multiple computers in the first action, you will get the nice “multi observe” window in Remote Desktop. Or you can replace the “Choose Remote Computers” action with a “Choose Computer Lists” action.
Sync’ing iTunes Libraries | Krypted.com.
So I decided to offload most of my media (photos, movies, etc) off my laptop and onto my Mac Mini server. I also decided that one thing I’d like to live on both is iTunes.
Nice writeup on how to keep a folder in sync with a folder on a server.
I saw this in my Twitter stream the other day:
You know what I want? A text-to-speech plugin for @instapaper so, while commuting to/from work, I can listen to the stuff I find at work.
That shouldn’t be too hard, should it?
First we have to get the unread articles from Instapaper. If you go to Instapaper, log in, and go to your unread articles, you can see the RSS button in the URL field in Safari. To get to the RSS feed in Automator, do the following:
If you are anything like me, this workflow will open quite a large number of pages. I think Instapaper limits the RSS feed to 25 items. That’s still a lot of new Safari tabs/windows to open. We want to add an action that restricts the number of items passed through it. Surprisingly, there is none in the default actions, but this is fairly easy to add. Insert a new “Run AppleScript” action before the “New Safari Documents” action and replace the default code with the following:
on run {input, parameters}
    set maxNum to 3
    -- filters all but the first maxNum items from the articles, change as appropriate
    -- enter '-1' or remove this action entirely to get all urls
    if (count of input) > maxNum then
        set output to items 1 through maxNum of input
    else
        set output to input
    end if
    return output
end run
This will only pass through the first maxNum items passed into it, regardless of type. You can change maxNum to fit your taste and/or needs. You can also set maxNum to -1 if you want to pass all items without removing the AppleScript action.
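The same filter could also be sketched as a shell function, for example for use in a “Run Shell Script” action that receives its input on stdin with one item per line. This is only an illustration under that assumption and not part of the workflow above; limit_items is a hypothetical name.

```shell
#!/bin/bash
# limit_items (hypothetical helper): pass through at most $1 lines from stdin.
# A negative limit passes everything, mirroring the -1 convention above.
limit_items() {
    if [ "$1" -lt 0 ]; then
        cat
    else
        head -n "$1"
    fi
}

# Example: keep only the first three URLs of a list.
printf '%s\n' "http://example.com/1" "http://example.com/2" \
    "http://example.com/3" "http://example.com/4" | limit_items 3
```

As in the AppleScript version, a shorter list passes through unchanged.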
Save again and try running it. The next step will be to filter the actual text out of the web page which will be a little tougher and the main topic of Part 2.
Top 5 OmniFocus Applescripts « Simplicity Is Bliss.
The following AppleScript are those which are extremely handy and I use on a daily basis. There are many others available on the forum and other places, but many of them didn’t really add much value to my workflow, which is a pretty standard one, or solve problems I never encountered.
Extended Attribute Sync – Syncing resource forks have historically been a pain for Mac users. In case you don’t know, resource forks are a secret area of a file that certain applications like Quicken, Quark, and OmniGraffle use to store important data. Most sync programs today completely ignore these forks, which results in a corrupted file on the other end. But worry not! Resource forks and other extended attributes now work great with Dropbox. Hooray!
This is great news. I use Dropbox to sync my ~/Library/Scripts and ~/Library/Services folders across multiple computers. ((The way I do that is to move all scripts and/or workflows I want to sync to ~/Dropbox/Scripts or ~/Dropbox/Services and replace the actual folder in ~/Library with a symlink: ln -s ~/Dropbox/Scripts ~/Library/Scripts))
While AppleScript files usually work, some Automator Workflow files store extra information in extended attributes and those would break when syncing. Now with the latest version of Dropbox things work fine.
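The move-and-symlink setup for the Scripts and Services folders can be wrapped in a small function. This is a minimal sketch, not the exact commands I ran; move_and_link is a hypothetical name, and the guard against running twice is my own addition.

```shell
#!/bin/bash
# move_and_link (hypothetical helper): move a folder ($1) to a synced
# location ($2) and leave a symlink behind at the old path.
move_and_link() {
    # Refuse to run if the folder is already a symlink or the target exists.
    if [ -L "$1" ] || [ -e "$2" ]; then
        echo "skipping $1" >&2
        return 1
    fi
    mkdir -p "$(dirname "$2")" &&
        mv "$1" "$2" &&
        ln -s "$2" "$1"
}

# Example usage, with the paths described above:
# move_and_link ~/Library/Scripts ~/Dropbox/Scripts
# move_and_link ~/Library/Services ~/Dropbox/Services
```

On every additional computer the synced folder already exists in Dropbox, so there you would only remove or rename the local folder and create the symlink.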
If you don’t use Dropbox yet, you should try it. You can support scriptingosx.com a little by signing up to Dropbox through this link; then both you and I will get some extra free storage space.
Delete ‘Where From’ metadata from files – Mac OS X Hints.
I knew I would have to do this often, so instead of running a shell script every time I wanted to strip the Where From, I wrote an AppleScript
This is why and how you write scripts. 🙂
This article includes ten more little known Automator hacks, not included in the guide, that you can create in a few easy steps.
on makeuseof.com. (Thanks to @marrathon.)
If you have a bash script with a while loop that reads from stdin and uses ssh inside that loop, the ssh command will drain all remaining data from stdin. ((This is true not only for ssh but for any command in the loop that reads from stdin by default.)) This means that only the first line of data will be processed.
I encountered this issue yesterday. ((I won’t go into details here, since it is for a very specialized purpose. I will say that it involved jot, ssh, an aging Solaris-based network appliance, and some newfangled XML/Web 2.0.)) This website explains why the behavior occurs and how to avoid it.
A flawed method to run commands on multiple systems entails a shell while loop over hostnames, and a Secure Shell SSH connection to each system. However, the default standard input handling of ssh drains the remaining hosts from the while loop
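The effect can be reproduced without any remote hosts. The following is a minimal sketch: fake_ssh and list_hosts are hypothetical stand-ins, with fake_ssh reading whatever is left on stdin just as ssh does by default. The second loop gives the command its own stdin from /dev/null; with the real ssh, the -n option has the same effect.

```shell
#!/bin/bash
# fake_ssh (hypothetical stand-in for ssh): consumes all of stdin,
# then reports which host it was called with.
fake_ssh() {
    cat > /dev/null
    echo "connected to $1"
}

# list_hosts (hypothetical): prints one hostname per line.
list_hosts() {
    printf '%s\n' alpha beta gamma
}

# Broken: fake_ssh inherits the loop's stdin and swallows the remaining hosts.
broken=$(list_hosts | while read -r host; do
    fake_ssh "$host"
done)

# Fixed: redirect the command's stdin so the loop keeps its input.
fixed=$(list_hosts | while read -r host; do
    fake_ssh "$host" < /dev/null
done)

echo "broken loop:"
echo "$broken"
echo "fixed loop:"
echo "$fixed"
```

The broken loop only reaches alpha, while the fixed loop reaches all three hosts.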