When analyzing categorical data, sometimes the Chi-Square test just isn't right for testing goodness-of-fit or independence, so many people recommend a G-test instead. http://www.biostathandbook.com/chiind.html
Being an R user, obviously I'd like to run this test along with my other tests. A little searching of the web, and the answers are littered with "R doesn't have a G-test built in, here's code to do it yourself...", which is half true: unlike chisq.test, base R does not appear to have a G-test. I'd rather leave the coding of standard statistics to people who really know the ins and outs of the formulas and have a good way to verify the answers.
Well, then, when I got significant results I started looking for post-hoc tests. In doing so, it turns out that several packages also do G-tests as part of their measures-of-association tests (typically used as post-hoc tests).
So there you have it: base R doesn't include a G-test, but at least 3 packages do, so people don't need to keep re-writing it.
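For the curious, the statistic itself is not much code. Here's a minimal sketch (in Python rather than R, purely to illustrate the formula) of the goodness-of-fit G statistic, G = 2 Σ O·ln(O/E), which is compared against the same chi-square distribution you'd use for a Pearson test; the die-roll counts are made up for the example:

```python
from math import log

def g_statistic(observed, expected):
    # G = 2 * sum(O * ln(O / E)); cells with O = 0 contribute nothing
    return 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)

# toy goodness-of-fit example: 100 die rolls against uniform expected counts
observed = [22, 15, 13, 18, 16, 16]
expected = [100 / 6] * 6
print(round(g_statistic(observed, expected), 3))  # compare to chi-square with 5 df
```

The point of leaving it to a package, of course, is that a maintained implementation also handles contingency tables, continuity corrections, and the p-value lookup for you.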
FYI http://www.rdocumentation.org/ is awesome if you haven't seen it yet.
I ended up needing to configure a few multi-function machines to print and scan via wifi with Linux. Here are the details of what you need to know. Specifically, I set up a Brother HL-2280DW and an Epson WF-3540 on Ubuntu 12.04.
In general, give the machine a static IP address, either on the printer itself or via your home router using a DHCP reservation based on its MAC address.
Figuring out the device URI was the trickiest part, as Ubuntu never seems to guess it quite right. The printing drivers tend to be found automatically; if that fails, both vendors have them available on their websites.
Add Printer, from network, give it the IP of the machine, then pick the lpd option.
Device URI: lpd://192.168.1.1xx/BINARY_P1
Go to the Brother support site and get the following files for installation:
- Scanner driver
- Scanner Setting File
Now also make sure you have sane installed.
Run the following to register your multi-function:
brsaneconfig4 -a name=Brother model=HL-2280DW ip=192.168.1.1xx
It should now work with sane-based programs.
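To check that the registration took, a couple of quick commands help (a sketch; brsaneconfig4's -q option lists supported models and registered devices in the driver versions I've seen, and scanimage comes with sane-utils):

```
brsaneconfig4 -q   # should show the HL-2280DW in the registered device list
scanimage -L       # sane should now report the Brother as an available scanner
```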
Add Printer, from network, give it the IP of the machine, then pick the lpd option.
Device URI: lpd://192.168.1.1xx/printers/epson
Search the Epson download site for drivers. I needed:
- WF-3540 Series Scanner Driver Linux core package&data package
- WF-3540 Series Scanner Driver Linux network plugin package
Install them in that order. Now also make sure you have sane installed. Then edit /etc/sane.d/epkowa.conf (this is the part no one on the web seems to describe). If you can't find the file, you might need to install libsane-extras.
In the net section, add a line with your multi-function's IP address.
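As a sketch, assuming the machine is at the same 192.168.1.1xx address used for printing, the relevant part of /etc/sane.d/epkowa.conf ends up looking something like this (the shipped file has commented examples showing the exact syntax, so check those):

```
# /etc/sane.d/epkowa.conf, net section
net 192.168.1.1xx
```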
Save that and now when you open iscan or sane it should find your scanner.
Happened to be making some maps today, and realized 1:110m would be better than 1:10m for small world maps in R (much faster too). I had the whole Natural Earth dataset downloaded in SQLite format. SQLite is great, but I can't run spatial queries on it unless it's in Spatialite format (they store the geometries differently).
GDAL/OGR to the rescue:
ogr2ogr -f SQLite natearth_vector_spatialite.sqlite natural_earth_vector.sqlite -skip-failures -nlt PROMOTE_TO_MULTI -dsco SPATIALITE=YES
Turns out Spatialite, and I suspect PostGIS, don't like it when you mix Multi and non-Multi geometries in a column declared Multi. Thankfully Even Rouault solved this in GDAL 1.10 with -nlt PROMOTE_TO_MULTI.
A few hours later, 400+MB of great base material for cartography...
Oh wait, try to dissolve countries into UN subregions, and what are all those weird partial lines in the middle of what should be solid polygons? Slivers, of course: places where the topology of the borders is not snapped. Two ways to fix it:
- Process in QGIS with the GRASS tool v.dissolve, setting a tolerance under the advanced options
- Buffer the polygons first before smushing them together (thanks Brian):
CREATE TABLE subregionsT AS SELECT subregion,CastToMultiPolygon(GUnion(Buffer(Geometry,0.00001))) as geometry FROM ne_110m_admin_0_countries GROUP BY subregion;
Solution 1 is probably cleaner, as I wouldn't then have to clip the continents to match the coastline again, but solution 2 let me keep it all in the same db where the data started, with fewer steps.
It seems that some PPAs have newer versions of apps than stock 12.04. This can cause nightmares when you go to install stuff that needs the ia32 multiarch packages, because the i386 version has to be the same as the amd64 version.
After a couple of days of trying to resolve the packages by hand and force versions, I came across this post, which uses apt pinning in the preferences file to downgrade everything to stock. http://ubuntuforums.org/showpost.php?p=12246372&postcount=7
Once you're running stock packages, Wine, Skype, etc. should install...
Re: ia32-libs error [Can't install on amd64]
I had a similar problem with broken dependencies when trying to install wine and acroread, just after upgrading to 12.04 from 11.04 (passing over 11.10). It seems that some PPAs I had in 11.04 installed newer versions of applications on the system. After upgrading, the remains of these apps made a mess of the dependencies.
The solution that seems to work (so far) was found on a German Ubuntu board (http://forum.ubuntuusers.de, posts from user Lasall):
First a downgrade is required. Create the 'preferences' file:
sudo vi /etc/apt/preferences
and insert the following lines:
Package: *
Pin: release a=precise*
Pin-Priority: 2012
Pin-Priority must be greater than 1000.
Then you may downgrade the programs with:
sudo apt-get dist-upgrade
Then you may install packages that complained about dependencies, like:
sudo apt-get install ia32-libs-multiarch
Finally, you should remove the file you just created, because otherwise no new updates will be found:
sudo rm /etc/apt/preferences
Hope this helps you too!
I had a chance this last week to do a little bit of analysis on the download logs for the OSGeo-Live project. The basics: downloads have increased quite a bit from version 4.5 to 5.0, and the full 4.4 GB iso file is the most popular, but that doesn't mean there aren't quite a few people downloading the other variants.
There is some uncertainty in the actual numbers, as I haven't had a chance to filter out bots, incomplete downloads, etc... Also, for those wondering, I do plan to follow up with a map of downloads by country/region soon, but the early estimate is that people from 100 different countries have downloaded.
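The basic tally is simple enough; here's a minimal sketch of the kind of count I ran, assuming Apache combined-format logs (the log lines and iso filenames below are invented for illustration). Skipping 206 responses is a crude first pass at dropping partial/resumed downloads:

```python
import re
from collections import Counter

# sample Apache-style log lines (made up for the example)
LOG_LINES = [
    '1.2.3.4 - - [10/Jan/2011:10:00:00 +0000] "GET /osgeo-live-5.0.iso HTTP/1.1" 200 4718592000',
    '5.6.7.8 - - [11/Jan/2011:11:00:00 +0000] "GET /osgeo-live-5.0.iso HTTP/1.1" 206 1024',
    '9.9.9.9 - - [12/Jan/2011:12:00:00 +0000] "GET /osgeo-live-4.5.iso HTTP/1.1" 200 4294967296',
]

# capture the requested .iso path and the HTTP status code
pattern = re.compile(r'"GET (\S+\.iso) HTTP/[\d.]+" (\d{3})')

counts = Counter()
for line in LOG_LINES:
    m = pattern.search(line)
    # count only complete downloads (status 200), skipping partial 206 responses
    if m and m.group(2) == '200':
        counts[m.group(1)] += 1

print(counts.most_common())
```

A real pass would also need to fold in bot user-agents and byte counts, which is exactly the filtering mentioned above.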
These graphs represent data for all of 2011 from 2 of the 5 servers, the 2 in California.
Anyone know what the difference is between viewed, entry, and exit in awstats?
So if you buy a 3 TB drive (or anything bigger than 2 TB) and want to use it as the primary drive for your machine, you will need to use a GPT partition table instead of the classic MBR.
Here are a couple of tricks/tips which should help:
- You need to be using an OS that has GRUB2
- When partitioning, the first partition should be a 1 MB section with the bios_grub flag (recent versions of the Ubuntu installer, at least 11.04, have this option; on 10.04 I had to set it with a Live disc and parted)
- When you get to the install-GRUB question, if you happen to be installing to something other than /dev/sda, say no, and it will then ask you which drive or partition to install to.
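If you're doing the partitioning by hand, the parted side of the second step looks roughly like this (a sketch against a hypothetical, empty /dev/sdX; mklabel destroys any existing partition table, so double-check the device name):

```
sudo parted /dev/sdX mklabel gpt                # new GPT partition table
sudo parted /dev/sdX mkpart primary 1MiB 2MiB   # tiny partition for GRUB's core image
sudo parted /dev/sdX set 1 bios_grub on         # flag it so GRUB2 can embed itself there
```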
Here's a copy of the poster I did for the AAG 2011 meeting. It's part of my master's thesis on geoinformatic techniques for dealing with GPS telemetry data using an open source stack:
- Spatialite (SQLite)
See the attached pdf, which was created in LaTeX using the Beamer and Beamerposter packages.
I'm not sure why, but home routers seem to have a finite lifetime before they start misbehaving in strange ways. Last week mine started acting up in a way I've never seen before, one in which power cycling seems to have no effect.
What was it doing? It decided I wasn't allowed to view websites or make any other type of connection to one very specific subnet, which is where my servers happen to live. The rest of the Internet worked as usual.
After a couple of days of trying to figure out where the problem was, I narrowed it down to my home router by plugging my laptop into the Internet service directly, which worked.
Part 2: Now that I knew it was the router and had some traceroutes handy, my best guess was that my computer was sending the request but the data was never coming back from the server; the browser gave messages like "server took too long to return the request". Notice how it didn't say it couldn't find the server. Traceroutes, no matter how many hops (should have been 16), kept going with * * *, which made me think an endless loop was somewhere.
Fingers started to point to the built in SPI Firewall.
So I tried turning off SPI and NAT filtering... upgraded the router firmware, reset the settings... nothing. (Maybe I need to try the mythical 30-30-30 method to flush the NVRAM.)
Plan B: Giving up on the router, I went to my spare router. Hooked it all up, got connected, turned on the firewall, and wham, no Internet at all, and reverting the settings didn't fix it.
The good news is I had intentionally bought 2 routers that shipped with, or were capable of running, Linux-based open source firmware (Netgear WGR614Gv8, Asus WL520gu). So began the night of researching how to flash open source firmware onto a router.
Solution: After reading many pages, and some 20-100 step processes, I found a nifty 3-step one that worked great the first time: Flashing an Asus WL520gu in 3 steps with Tomato (I actually used Tomato-USB).
Database servers are great, but there's a lot of magic in there sometimes, and it can be hard to figure out just how much storage is being taken up by which database and which tables.
A nice little hint on how to check the size of the whole or parts of your database server (Postgres): http://feeding.cloud.geek.nz/2009/02/finding-size-of-postgres-database-on.html
Or, for the lazy:
SELECT pg_database.datname, pg_size_pretty(pg_database_size(pg_database.datname)) AS size FROM pg_database;
Just wanted to share a project I recently became aware of after making a trek over to WhereCamp 2011.
They've got some great ideas for home-brewing some nice science equipment for remote sensing; check it out at http://publiclaboratory.org
Here's a link to my flickr stream with photos of some of their airborne camera platforms.
More and more, when I make a dual-boot system, it turns out that 6 months to a year down the line the Windows partition just isn't needed anymore. But now you've got 10GB+ of disk just sitting at the front of the drive.
Over the holiday I tackled a shuffling of partitions, and here are the important tips I picked up.
- Copy your important data to another drive (an external usb is great)
- Using Ubuntu disk tools like gparted, blank the space where you want to move stuff to.
- Using the Clonezilla live disc (and either partimage or partclone, the newer variant that handles ext4), clone your / partition over to the new space.
- Relabel the UUID of this new partition; otherwise it will be identical to the UUID of the original and the bootloader will quasi-load both:
uuidgen
tune2fs /dev/hdaX -U <number generated by uuidgen>
- Edit your grub config to boot the new partition. If you reboot into Ubuntu, running update-grub will find it.
- Once you're sure you can boot the relocated /, you can add the empty space onto your /home (I always recommend separate / and /home partitions)
Things I also recommend:
- Converting ext3 to ext4
- Creating a Private directory for storing encrypted stuff.
This quarter some students and professors got together to reinvent/recreate/re-instigate cartography at UC Davis. While this isn't my first cartography course, it's been a bit more realistic in terms of applying the ideas to making maps.
I'll link to the full pdf later. The Creative Commons license in the footer applies.
So VMware Server is an interesting product for virtualization. It does some things really well (like letting you open a desktop OS without installing remote desktop tools) and seems to just fail at others (like a web management tool that you can't get into half the time).
Tonight's frustration: lack of support for Firefox 3.6. But there's a bit of a workaround. If you go into about:config, find security.enable_ssl2, and set it to true, the Web Access site actually seems to work reliably (so far).
However, the console to any VM will always time out. To work around this:
- make sure you've installed the console plugin
- go to your firefox settings directory
- find your way into your profile/extensions/VMWare.../plugins
- way down here you'll find a vmware-vmrc
- to be safe, enable execute permission on this and all the other vmware scripts in this folder, and in the bin (vmware-vmrc) and lib (wrapper-gtk24.sh) folders in this directory
- now you can directly call, or set up a shortcut to, vmware-vmrc:
Linux:
vmware-vmrc -h [<hostname>:<port>] [-u <username> -p <password>] [-M <moid> | <datastore path>]
Windows:
vmware-vmrc.exe -h <hostname>:<port> [-u <username> -p <password>] -M <moid> | <datastore path>
- if you leave off command parameters, it will just ask you in the GUI
The port number is really important; no idea what moid is yet. And voila, it seems to work. It also seems to be more reliable than the web interface (note there is a tool in the web interface to create a shortcut that does the above, and big surprise, it doesn't work in Firefox 3.6, hence the hack around).
Some of you may have arrived here looking for my photos. That site is temporarily down while I shift some things around, upgrade some servers, and come up with a better long term plan of what I want to do.
As it was, I hadn't added any new photos for several years, and that seemed quite silly, primarily because it was a technical issue; who knew moving 100s of photos onto a decent web server where visitors can browse efficiently would be so confusing.
Anyways, be patient, let me know if you have questions. tech at wildintellect dot com
So at the AAG conference last year, we ran an OSGeo booth. Some representatives from the North American Cartographic Information Society (NACIS) approached and invited us to their conference. (It wasn't the first time; I had been asked previously after one of my talks on FOSS.)
Now the important part: the California Chapter gave a 50-minute, 4-app demo at the NACIS "Practical Cartography Day" to an audience of 150. Take-home message: cartographers want good SVG output.
Notes from the rest of the conference: "Open" was actually mentioned a lot. Here's a rough breakdown of the frequency of relevant topics (in presentations):
- PostGIS ++
- OpenLayers (not by name, but it showed up in slides and on demo sites) +++
- Mapnik ++
- GDAL +
- PHP +++
- Open Source +++++ (even ESRI)
- Python +++
- OpenStreetMap ++++
- Flash/Flex +++++++
- OGC +
- Inkscape +
- GIMP +
- WMS +
(Maybe I'll post a plot when I get a chance.)
Next post: Some new public domain datasets people are going to want to get their hands on...
Congratulations to Gary Sherman, whose recent book has successfully made it to the shelves of academia. Well, that might be in part due to our librarian taking advice on which open source GIS books are missing that should be on the shelf. Lucky for everyone else, since the publisher didn't classify it as a textbook, it's affordable too if you want your own copy, paper or ebook.
Desktop GIS: Mapping the Planet with Open Source. Pragmatic Bookshelf, 360 pages, ISBN 1934356069, http://www.pragprog.com/titles/gsdgis/desktop-gis
Wondering what other books you've missed? See the OSGeo Library.
It comes up quite often that I need a flyer for this or that. Just a few pages, sometimes quarter, third, or half sheets for putting up around campus for people to see. Once you do a few, though, it often happens that you need the same thing again later with a few minor variations. Sure, you could do it all in one application, but when not doing full pages you have to keep duplicating your information 2-4 times on the same page in a way that lines up well with being cut.
This is where a layout application comes in handy; specifically, I use Scribus. The idea here is to make one image and then replicate it multiple times across a page, all at once, evenly. Well, that and make a high-resolution, ready-to-print PDF.
So start by making your image/item. In this case I didn't have a ton of text and it's kind of a free-float style (not paragraphs), so I used Inkscape; well, that and it's the format the flyer was originally given to me in. Had there been more text, I would have started with OpenOffice, done the graphics in Inkscape or Gimp, and done 100% of the layout in Scribus.
After writing the text, changing and scaling fonts, putting in the image, and adjusting transparencies and background colors, it's time to export the image. From Inkscape in particular, exporting to bitmap (png) gives you the chance to specify your dpi and ensure it will show up correctly when you insert it into other documents. For printing I usually use 300dpi, and in this case, to avoid dealing with margins, I exported only the drawing, not the page.
- Now I set a guide to split the page in half.
- Turn on guide snapping and grid snapping.
- Draw an image box, snapping it to the guides.
- Get Picture, grab the png export.
- Duplicate (copy) and snap a second one onto the bottom half.
- PDF export, no compression.
And voila, the next Linux Users' Group of Davis Installfest flyer is done.
- Inkscape svg
- Export png
- Scribus sla
- Final Product pdf
I ended up wanting to analyze commute paths on several networks, but instructions on how to properly prepare a network file, with new points snapped to it as nodes, were a little less than clear. I'm not 100% sure this is right, but it is pieced together from the command history GRASS stored with each layer in my mapset.
#bring the layer in
v.in.ogr -o dsn="/scratch/congelton/davis_ped_net/ped_net_sep28.shp" output="pednets28" min_area=0.0001 snap=-1

#find the nearest line to a point and create a line that connects them
v.distance -p from="davissubset@PERMANENT" to="pednetsep28" from_type="point" to_type="point,line,area" from_layer=1 to_layer=1 output="ppl2pednet" dmax=-1 upload="dist" column="dist"

#add categories to the distance lines (I think this is required, otherwise v.net won't work later; if the cat column is already populated then you can skip this)
v.category input="ppl2pednet" output="ppl2pednetcat" type="point,line,boundary,centroid,area" option="add" cat=1 layer=1 step=1

#patch the distance lines to the original points, so you have the nodes for v.net
v.patch input="ppl2pednetcat,pednets28" output="pplpednet"

#patch the distance lines to the network
v.patch input="pplpednet,davissubset" output="pplonpednet"

#I ran a clean before the actual v.net command to make sure I dropped things that wouldn't work, outliers
v.clean input="pplonpednet" output="pplonpednetclean3" type="line,point" tool="snap,break" thresh=3,3

#run the network shortest path using the original points as start and end points, in batch from a csv; the point id is its cat
v.net.path input="pplonpednetclean3" output="dcommute3" type="line,boundary" alayer=1 nlayer=1 file="pplonpednetclean.csv" dmax=1000

#example of the csv (autonumber, start node cat, end node cat)
#1 1 3000
#2 5 3000
#3 6 3000
#4 7 3000
#5 8 3000
#6 9 3000
#7 10 3000
#8 14 3000
#9 15 3000
#10 25 3000
#11 26 3000
#12 27 3000
#yes, all my people traveled to the same end point
Things to watch out for:
- A network file should have both lines and points with the same layer number (i.e. 1_points, 1_lines)
- A network file with no cat column in the points component