Discussion:
[Orgmode] importing google docs document into org
Le Wang
2011-02-17 15:43:26 UTC
Hi all,

Does anyone have a good workflow for doing this? I keep a bunch of notes on
google docs with a plain outline structure of using styles "Heading 1", etc,
bullets and hyperlinks. All of this is easily doable in org-mode. It would
be great if I could import them into org. Now that I find myself editing in Emacs
more and more, the editing features of Google docs just don't cut it any
more. I'm using Windows with the cygwin tool stack.

As a side note, is there a way to paste rich content (hyperlinks) into emacs
on Windows? I think I saw an org screencast where this worked on OS X.

Thanks.
--
Le
Puneeth Chaganti
2011-02-17 18:41:45 UTC
Post by Le Wang
Hi all,
Does anyone have a good workflow for doing this?  I keep a bunch of notes on
google docs with a plain outline structure of using styles "Heading 1", etc,
bullets and hyperlinks.  All of this is easily doable in org-mode.  It would
be great if I could import them into org. Now that I find myself editing in Emacs
more and more, the editing features of Google docs just don't cut it any
more.  I'm using Windows with the cygwin tool stack.
T V Raman's g-client package allows you to edit google docs from within
Emacs, and it also has support for publishing from org. More
information is available in this blog-post [0] that I came across.

HTH,
Puneeth

[0] http://blog.vivekhaldar.com/post/1649745633/editing-google-docs-in-emacs
Le Wang
2011-02-18 05:02:33 UTC
Thanks. I just got this working, and it only imports google docs as text.
I'm really after some way to import rich format (not so rich, heading tags,
bullets and hyperlinks).

I can get the information out of google docs easily enough as html, pdf,
word, etc. But how do I get it into org-mode?
--
Le
Puneeth Chaganti
2011-02-18 05:24:12 UTC
Post by Le Wang
Thanks. I just got this working, and it only imports google docs as text.
I'm really after some way to import rich format (not so rich, heading tags,
bullets and hyperlinks).
I can get the information out of google docs easily enough as html, pdf,
word, etc. But how do I get it into org-mode?
You could try using Pandoc [1]. It can parse various markups
(including html) and reformat it into various other markups (including
org). There may be a few rough ends in the org-exporter, but it might
be good enough for you.

Hope that helps,
Puneeth

[1] - http://johnmacfarlane.net/pandoc/
Le Wang
2011-02-18 12:27:31 UTC
Post by Puneeth Chaganti
You could try using Pandoc [1]. It can parse various markups
(including html) and reformat it into various other markups (including
org). There may be a few rough ends in the org-exporter, but it might
be good enough for you.
Outstanding. I exported the document from google docs to html then used
pandoc to convert to org-mode. The conversion wasn't perfect. I had to
manually edit a few things. But it wasn't too painful.
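
The conversion step described above can be scripted. What follows is a minimal sketch in Python, not the poster's actual procedure: the file names `notes.html` and `notes.org` and both function names are hypothetical, and it assumes pandoc is installed and on the PATH. It relies only on pandoc's `--from`, `--to` and `--output` options.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(html_path, org_path):
    """Return the argument list that converts an HTML file to Org with pandoc."""
    return [
        "pandoc",
        "--from=html",   # input is the HTML export from Google Docs
        "--to=org",      # output is Org markup
        "--output", str(org_path),
        str(html_path),
    ]


def convert_html_to_org(html_path, org_path):
    """Run pandoc; raise if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(build_pandoc_command(html_path, org_path), check=True)
    return Path(org_path)
```

Calling `convert_html_to_org("notes.html", "notes.org")` would write the Org file next to the export. As noted in the post, the result may still need some manual clean-up.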

Here are some support functions I made to scrub the hyperlinks.

(require 'url-util) ; provides `url-unhex-string'

(defun le::fix-google-doc-link ()
  "Unhexify Google redirect URLs and collapse any repeated links."
  (interactive)
  (let ((link-regexp
         "\\[http://www\\.google\\.com/url\\?q=\\(.*?\\)&sa=.*?\\]"))
    (save-excursion
      (goto-char (point-min))
      (le::collapse-org-consecutive-links)
      (goto-char (point-min))
      (while (re-search-forward link-regexp nil t)
        ;; Replace "[redirect-url]" with "[decoded-target-url]".
        (replace-match (concat "["
                               (save-match-data
                                 (url-unhex-string
                                  (match-string-no-properties 1)))
                               "]")
                       t t)))))


(defun le::collapse-org-consecutive-links ()
  "Collapse adjacent Org links that share a URL into one link.
Pandoc's conversion of Google Docs HTML sometimes breaks a
multi-word hyperlink into individual links."
  (interactive)
  (let ((link-regex
         "\\[\\[\\([^[:space:]]*?\\)\\]\\[\\(\\(?:.\\|\n\\)*?\\)\\]\\]")
        url
        text)
    (while (re-search-forward link-regex nil t)
      (setq url (match-string-no-properties 1)
            text (match-string-no-properties 2))
      ;; Absorb each immediately following link that has the same URL.
      (save-match-data
        (catch 'done
          (while (looking-at link-regex)
            (if (string-equal url (match-string-no-properties 1))
                (progn
                  (setq text (concat text (match-string-no-properties 2)))
                  (replace-match ""))
              (throw 'done nil)))))
      ;; Write the merged description back into the first link.
      (unless (string-equal text (match-string-no-properties 2))
        (replace-match text t t nil 2)))))
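
For readers working outside Emacs, the URL clean-up that the first function performs can be illustrated with a short Python sketch. This is an illustration under assumptions, not a port of the code above: the function names are hypothetical, the sample URL in the usage note is made up, and it reads the target from the `q` query parameter with the standard `urllib.parse` module instead of the regexp used in the post. It does not merge consecutive links.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Matches the URL part of an Org link that points at Google's redirector.
GOOGLE_REDIRECT = re.compile(r"\[(https?://www\.google\.com/url\?[^\]\s]*)\]")


def unwrap_google_url(url):
    """Return the real target of a Google redirect URL, or url unchanged."""
    parts = urlsplit(url)
    if parts.netloc == "www.google.com" and parts.path == "/url":
        target = parse_qs(parts.query).get("q")
        if target:
            return target[0]  # parse_qs has already percent-decoded the value
    return url


def scrub_org_links(org_text):
    """Replace every [redirect-url] in Org text with [target-url]."""
    return GOOGLE_REDIRECT.sub(
        lambda m: "[" + unwrap_google_url(m.group(1)) + "]", org_text
    )
```

For example, `scrub_org_links("[[http://www.google.com/url?q=http%3A%2F%2Forgmode.org%2F&sa=D][Org]]")` rewrites the link target to `http://orgmode.org/` and leaves the description `Org` alone.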
--
Le
Puneeth Chaganti
2011-02-18 12:57:38 UTC
Post by Puneeth Chaganti
You could try using Pandoc [1]. It can parse various markups
(including html) and reformat it into various other markups (including
org). There may be a few rough ends in the org-exporter, but it might
be good enough for you.
Post by Le Wang
Outstanding. I exported the document from google docs to html then used
pandoc to convert to org-mode. The conversion wasn't perfect. I had to
manually edit a few things. But it wasn't too painful.
Here are some support functions I made to scrub the hyperlinks.
Glad it worked.

Actually, if you can describe the problems that you were faced with, I
would be interested in spending some time and trying to fix those
problems with Pandoc or at least report issues and get them fixed. (I
had contributed the original exporter, though my Haskell knowledge is
negligible.)
--
Puneeth
Le Wang
2011-02-20 06:31:28 UTC
Post by Puneeth Chaganti
Actually, if you can describe the problems that you were faced with, I
would be interested in spending some time and trying to fix those
problems with Pandoc or at least report issues and get them fixed. (I
had contributed the original exporter, though my Haskell knowledge is
negligible.)
Upon closer inspection of the google docs exported html, it appears that the
problems I had were not pandoc issues but google docs issues.

Thanks again.
--
Le
Puneeth Chaganti
2011-02-20 08:00:45 UTC
Post by Puneeth Chaganti
Actually, if you can describe the problems that you were faced with, I
would be interested in spending some time and trying to fix those
problems with Pandoc or at least report issues and get them fixed. (I
had contributed the original exporter, though my Haskell knowledge is
negligible.)
Post by Le Wang
Upon closer inspection of the google docs exported html, it appears that the
problems I had were not pandoc issues but google docs issues.
Righto! Thanks for taking the time to inspect.

Thanks,
Puneeth
