Daniel Lemire's blog
Daniel Lemire's blog is about life in academia, research in Computer Science, and wondering how we can reconcile fast databases and algorithms with the informal and asemantic nature of the world around us. It is broadcast from Montreal (Canada).
- Native XML databases: have they taken the world over yet? (Yesterday)
Some years ago, the database research community jumped into XML. Finally, something new to work on! For about 5 years now, I have seen predictions that the XML databases would take the world over. Every organization would soon have its XML database. People would run web sites out of XML databases. Countless start-ups emerged ready to become the next Oracle.
What happened in practice is a bit underwhelming. Oracle, Microsoft, MySQL and others all included some XML support in their relational databases, but native XML databases failed to grasp any market share.
Where are we?
- Regarding programming languages, XQuery finally became a W3C recommendation in January 2007. More or less, XQuery together with XPath specify the equivalent of a select instruction in SQL.
- What if you want to update your XML database? XUpdate has been around for some time, but it is not widely supported. The W3C is working on something called XQuery Update Facility.
- Interfacing XQuery with your favorite programming language is still awkward. We have an API for XML databases (XML:DB), but I am not sure how well it is supported by the various vendors.
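To make the comparison with a SQL select concrete, here is a minimal sketch in Python. It uses only the limited XPath subset built into the standard library's `xml.etree.ElementTree` (not XQuery, which the standard library does not implement). The catalog document, its element and attribute names, and the helper `titles_for_year` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative.
CATALOG = """
<catalog>
  <book year="2005"><title>XML Basics</title></book>
  <book year="2007"><title>XQuery in Practice</title></book>
  <book year="2007"><title>Relational Design</title></book>
</catalog>
"""

def titles_for_year(xml_text, year):
    """Return the titles of all books with the given year attribute.

    Roughly the XML counterpart of:
        SELECT title FROM book WHERE year = :year
    """
    root = ET.fromstring(xml_text)
    # ElementTree supports only a subset of XPath; an attribute
    # predicate such as [@year='2007'] is part of that subset.
    books = root.findall(f"./book[@year='{year}']")
    return [book.findtext("title") for book in books]

print(titles_for_year(CATALOG, "2007"))
```

The path expression plays the role of the FROM and WHERE clauses, and the list comprehension plays the role of the projection. A full XQuery engine would express both in one query.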
Want to take an XML database out for a spin? Some XML databases worthy of mention:
- Put your lectures online easily and for free with Panopto (December 2)
I saw an impressive online course this morning using Panopto. The asynchronous videocasting was really convincing. Basically, the PowerPoint slides are synced with the video, and you can move up or down in the slide deck, with the video syncing automatically. Students can annotate your slides. You can add secondary video feeds or screen capture. What is more, a trusty colleague said it was really easy. You can do it on your own given a good camera. The catch is that Windows is required. The price is free or relatively cheap.
Update: See the live demo.
Reference: The November press release where Panopto announces the free version of their product.
- Are you really running out of time? (December 2)
A common feeling among creative workers is the lack of time. Yet, most people will run out of energy before they run out of time. A single task that takes you 5 minutes (asking a BDO for IP rights) can drain you for a week. Another task, like lecturing for 3 hours, can energize you for the rest of the week. Highly productive people do not have more time, but they may have more energy, more method and better feedback on their progress.
I believe that three problems lead us to conclude we lack time:
- You are spending too much time on boring tasks. To be productive, you need to work on projects you love. For this reason, creative people should pick their projects.
- You fail to manage your projects. Without help, you can only keep track of about 7 projects or tasks at any one time. If you want to do more, a method is needed. Myself, I use GTD. But some method is needed to scale up to a large number of projects. Without method, you will drift to unessential tasks and then blame the lack of time to explain why important tasks went unattended.
- You do not measure your progress. You need to get feedback about the quality and quantity of your work. Myself, I put my work under Subversion and get daily emails of what files changed. It is a crude but effective measure of my work. Also, tracking your project carefully, at the task level
- Social Networking for Scientists: Mendeley (November 28)
Among scientist-bloggers, the new buzzword is Mendeley: a social networking platform for scientists (Ricardo Vidal, Sylvie Noël, Misha Lemeshko, Michael Kuhn, …). The site is barely getting started and is still in early beta, so there are bugs and limitations. However, the London-based company has funding and a solid staff.
Their vision statement is compelling:
Mendeley is free social software for managing and sharing research papers. It is also a Web 2.0 site for discovering research trends and connecting to like-minded academics. To achieve our long-term vision of a “Last.fm for research”, we’re working with the former founding engineers of Skype and Last.fm’s former chairman.
Last night I created a profile. I got tired of entering my papers and I stopped entering them around 2005-2006. If you have 100 published papers, you are going to
- Innovative ideas are indistinguishable from crackpot ones (November 27)
It is impossible to distinguish objectively and systematically bogus work from high-quality work. You can sort work based on external attributes such as quality of the presentation, length, logical correctness, prestige of the authors, and methodology, but not on the significance of the work. Significance cannot be disproved at the time of the review. Even technical details end up being fundamental ideas: this happens frequently in mathematics, where lemmas often outshine theorems in the long term.
I review several research papers every month, and several research funding proposals every year. At best, I can determine that something is badly presented. I can find logical or mathematical errors. Beyond this, my opinion is probably often wrong.
Here are a few things I would have categorized, or did categorize, as crackpot ideas:
- Back in 1990, I would have predicted that the WWW was impractical. How can you deal efficiently with broken links? Who is going to maintain all these links? Yet, it works. I almost never encounter a 404 (missing page) error.
- Back in 1991, I would have laughed had anyone claimed that you can efficiently index and categorize over 8 billion dynamic Web pages, many of which appear and disappear frequently. Yet Google, Yahoo and many other search engines are able to index daily the content of my posts. They differentiate my content from webspam. They also determine the authority of my page. Yet, there is no central registry, no form of q
