diff --git a/LICENSE b/LICENSE
new file mode 100644
index 00000000000..cebe0354b23
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,662 @@
+GNU AFFERO GENERAL PUBLIC LICENSE
+ Version 3, 19 November 2007
+
+ Copyright (C) 2007 Free Software Foundation, Inc.
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The GNU Affero General Public License is a free, copyleft license for
+software and other kinds of works, specifically designed to ensure
+cooperation with the community in the case of network server software.
+
+ The licenses for most software and other practical works are designed
+to take away your freedom to share and change the works. By contrast,
+our General Public Licenses are intended to guarantee your freedom to
+share and change all versions of a program--to make sure it remains free
+software for all its users.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+them if you wish), that you receive source code or can get it if you
+want it, that you can change the software or use pieces of it in new
+free programs, and that you know you can do these things.
+
+ Developers that use our General Public Licenses protect your rights
+with two steps: (1) assert copyright on the software, and (2) offer
+you this License which gives you legal permission to copy, distribute
+and/or modify the software.
+
+ A secondary benefit of defending all users' freedom is that
+improvements made in alternate versions of the program, if they
+receive widespread use, become available for other developers to
+incorporate. Many developers of free software are heartened and
+encouraged by the resulting cooperation. However, in the case of
+software used on network servers, this result may fail to come about.
+The GNU General Public License permits making a modified version and
+letting the public access it on a server without ever releasing its
+source code to the public.
+
+ The GNU Affero General Public License is designed specifically to
+ensure that, in such cases, the modified source code becomes available
+to the community. It requires the operator of a network server to
+provide the source code of the modified version running there to the
+users of that server. Therefore, public use of a modified version, on
+a publicly accessible server, gives the public access to the source
+code of the modified version.
+
+ An older license, called the Affero General Public License and
+published by Affero, was designed to accomplish similar goals. This is
+a different license, not a version of the Affero GPL, but Affero has
+released a new version of the Affero GPL which permits relicensing under
+this license.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ TERMS AND CONDITIONS
+
+ 0. Definitions.
+
+ "This License" refers to version 3 of the GNU Affero General Public License.
+
+ "Copyright" also means copyright-like laws that apply to other kinds of
+works, such as semiconductor masks.
+
+ "The Program" refers to any copyrightable work licensed under this
+License. Each licensee is addressed as "you". "Licensees" and
+"recipients" may be individuals or organizations.
+
+ To "modify" a work means to copy from or adapt all or part of the work
+in a fashion requiring copyright permission, other than the making of an
+exact copy. The resulting work is called a "modified version" of the
+earlier work or a work "based on" the earlier work.
+
+ A "covered work" means either the unmodified Program or a work based
+on the Program.
+
+ To "propagate" a work means to do anything with it that, without
+permission, would make you directly or secondarily liable for
+infringement under applicable copyright law, except executing it on a
+computer or modifying a private copy. Propagation includes copying,
+distribution (with or without modification), making available to the
+public, and in some countries other activities as well.
+
+ To "convey" a work means any kind of propagation that enables other
+parties to make or receive copies. Mere interaction with a user through
+a computer network, with no transfer of a copy, is not conveying.
+
+ An interactive user interface displays "Appropriate Legal Notices"
+to the extent that it includes a convenient and prominently visible
+feature that (1) displays an appropriate copyright notice, and (2)
+tells the user that there is no warranty for the work (except to the
+extent that warranties are provided), that licensees may convey the
+work under this License, and how to view a copy of this License. If
+the interface presents a list of user commands or options, such as a
+menu, a prominent item in the list meets this criterion.
+
+ 1. Source Code.
+
+ The "source code" for a work means the preferred form of the work
+for making modifications to it. "Object code" means any non-source
+form of a work.
+
+ A "Standard Interface" means an interface that either is an official
+standard defined by a recognized standards body, or, in the case of
+interfaces specified for a particular programming language, one that
+is widely used among developers working in that language.
+
+ The "System Libraries" of an executable work include anything, other
+than the work as a whole, that (a) is included in the normal form of
+packaging a Major Component, but which is not part of that Major
+Component, and (b) serves only to enable use of the work with that
+Major Component, or to implement a Standard Interface for which an
+implementation is available to the public in source code form. A
+"Major Component", in this context, means a major essential component
+(kernel, window system, and so on) of the specific operating system
+(if any) on which the executable work runs, or a compiler used to
+produce the work, or an object code interpreter used to run it.
+
+ The "Corresponding Source" for a work in object code form means all
+the source code needed to generate, install, and (for an executable
+work) run the object code and to modify the work, including scripts to
+control those activities. However, it does not include the work's
+System Libraries, or general-purpose tools or generally available free
+programs which are used unmodified in performing those activities but
+which are not part of the work. For example, Corresponding Source
+includes interface definition files associated with source files for
+the work, and the source code for shared libraries and dynamically
+linked subprograms that the work is specifically designed to require,
+such as by intimate data communication or control flow between those
+subprograms and other parts of the work.
+
+ The Corresponding Source need not include anything that users
+can regenerate automatically from other parts of the Corresponding
+Source.
+
+ The Corresponding Source for a work in source code form is that
+same work.
+
+ 2. Basic Permissions.
+
+ All rights granted under this License are granted for the term of
+copyright on the Program, and are irrevocable provided the stated
+conditions are met. This License explicitly affirms your unlimited
+permission to run the unmodified Program. The output from running a
+covered work is covered by this License only if the output, given its
+content, constitutes a covered work. This License acknowledges your
+rights of fair use or other equivalent, as provided by copyright law.
+
+ You may make, run and propagate covered works that you do not
+convey, without conditions so long as your license otherwise remains
+in force. You may convey covered works to others for the sole purpose
+of having them make modifications exclusively for you, or provide you
+with facilities for running those works, provided that you comply with
+the terms of this License in conveying all material for which you do
+not control copyright. Those thus making or running the covered works
+for you must do so exclusively on your behalf, under your direction
+and control, on terms that prohibit them from making any copies of
+your copyrighted material outside their relationship with you.
+
+ Conveying under any other circumstances is permitted solely under
+the conditions stated below. Sublicensing is not allowed; section 10
+makes it unnecessary.
+
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
+
+ No covered work shall be deemed part of an effective technological
+measure under any applicable law fulfilling obligations under article
+11 of the WIPO copyright treaty adopted on 20 December 1996, or
+similar laws prohibiting or restricting circumvention of such
+measures.
+
+ When you convey a covered work, you waive any legal power to forbid
+circumvention of technological measures to the extent such circumvention
+is effected by exercising rights under this License with respect to
+the covered work, and you disclaim any intention to limit operation or
+modification of the work as a means of enforcing, against the work's
+users, your or third parties' legal rights to forbid circumvention of
+technological measures.
+
+ 4. Conveying Verbatim Copies.
+
+ You may convey verbatim copies of the Program's source code as you
+receive it, in any medium, provided that you conspicuously and
+appropriately publish on each copy an appropriate copyright notice;
+keep intact all notices stating that this License and any
+non-permissive terms added in accord with section 7 apply to the code;
+keep intact all notices of the absence of any warranty; and give all
+recipients a copy of this License along with the Program.
+
+ You may charge any price or no price for each copy that you convey,
+and you may offer support or warranty protection for a fee.
+
+ 5. Conveying Modified Source Versions.
+
+ You may convey a work based on the Program, or the modifications to
+produce it from the Program, in the form of source code under the
+terms of section 4, provided that you also meet all of these conditions:
+
+ a) The work must carry prominent notices stating that you modified
+ it, and giving a relevant date.
+
+ b) The work must carry prominent notices stating that it is
+ released under this License and any conditions added under section
+ 7. This requirement modifies the requirement in section 4 to
+ "keep intact all notices".
+
+ c) You must license the entire work, as a whole, under this
+ License to anyone who comes into possession of a copy. This
+ License will therefore apply, along with any applicable section 7
+ additional terms, to the whole of the work, and all its parts,
+ regardless of how they are packaged. This License gives no
+ permission to license the work in any other way, but it does not
+ invalidate such permission if you have separately received it.
+
+ d) If the work has interactive user interfaces, each must display
+ Appropriate Legal Notices; however, if the Program has interactive
+ interfaces that do not display Appropriate Legal Notices, your
+ work need not make them do so.
+
+ A compilation of a covered work with other separate and independent
+works, which are not by their nature extensions of the covered work,
+and which are not combined with it such as to form a larger program,
+in or on a volume of a storage or distribution medium, is called an
+"aggregate" if the compilation and its resulting copyright are not
+used to limit the access or legal rights of the compilation's users
+beyond what the individual works permit. Inclusion of a covered work
+in an aggregate does not cause this License to apply to the other
+parts of the aggregate.
+
+ 6. Conveying Non-Source Forms.
+
+ You may convey a covered work in object code form under the terms
+of sections 4 and 5, provided that you also convey the
+machine-readable Corresponding Source under the terms of this License,
+in one of these ways:
+
+ a) Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by the
+ Corresponding Source fixed on a durable physical medium
+ customarily used for software interchange.
+
+ b) Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by a
+ written offer, valid for at least three years and valid for as
+ long as you offer spare parts or customer support for that product
+ model, to give anyone who possesses the object code either (1) a
+ copy of the Corresponding Source for all the software in the
+ product that is covered by this License, on a durable physical
+ medium customarily used for software interchange, for a price no
+ more than your reasonable cost of physically performing this
+ conveying of source, or (2) access to copy the
+ Corresponding Source from a network server at no charge.
+
+ c) Convey individual copies of the object code with a copy of the
+ written offer to provide the Corresponding Source. This
+ alternative is allowed only occasionally and noncommercially, and
+ only if you received the object code with such an offer, in accord
+ with subsection 6b.
+
+ d) Convey the object code by offering access from a designated
+ place (gratis or for a charge), and offer equivalent access to the
+ Corresponding Source in the same way through the same place at no
+ further charge. You need not require recipients to copy the
+ Corresponding Source along with the object code. If the place to
+ copy the object code is a network server, the Corresponding Source
+ may be on a different server (operated by you or a third party)
+ that supports equivalent copying facilities, provided you maintain
+ clear directions next to the object code saying where to find the
+ Corresponding Source. Regardless of what server hosts the
+ Corresponding Source, you remain obligated to ensure that it is
+ available for as long as needed to satisfy these requirements.
+
+ e) Convey the object code using peer-to-peer transmission, provided
+ you inform other peers where the object code and Corresponding
+ Source of the work are being offered to the general public at no
+ charge under subsection 6d.
+
+ A separable portion of the object code, whose source code is excluded
+from the Corresponding Source as a System Library, need not be
+included in conveying the object code work.
+
+ A "User Product" is either (1) a "consumer product", which means any
+tangible personal property which is normally used for personal, family,
+or household purposes, or (2) anything designed or sold for incorporation
+into a dwelling. In determining whether a product is a consumer product,
+doubtful cases shall be resolved in favor of coverage. For a particular
+product received by a particular user, "normally used" refers to a
+typical or common use of that class of product, regardless of the status
+of the particular user or of the way in which the particular user
+actually uses, or expects or is expected to use, the product. A product
+is a consumer product regardless of whether the product has substantial
+commercial, industrial or non-consumer uses, unless such uses represent
+the only significant mode of use of the product.
+
+ "Installation Information" for a User Product means any methods,
+procedures, authorization keys, or other information required to install
+and execute modified versions of a covered work in that User Product from
+a modified version of its Corresponding Source. The information must
+suffice to ensure that the continued functioning of the modified object
+code is in no case prevented or interfered with solely because
+modification has been made.
+
+ If you convey an object code work under this section in, or with, or
+specifically for use in, a User Product, and the conveying occurs as
+part of a transaction in which the right of possession and use of the
+User Product is transferred to the recipient in perpetuity or for a
+fixed term (regardless of how the transaction is characterized), the
+Corresponding Source conveyed under this section must be accompanied
+by the Installation Information. But this requirement does not apply
+if neither you nor any third party retains the ability to install
+modified object code on the User Product (for example, the work has
+been installed in ROM).
+
+ The requirement to provide Installation Information does not include a
+requirement to continue to provide support service, warranty, or updates
+for a work that has been modified or installed by the recipient, or for
+the User Product in which it has been modified or installed. Access to a
+network may be denied when the modification itself materially and
+adversely affects the operation of the network or violates the rules and
+protocols for communication across the network.
+
+ Corresponding Source conveyed, and Installation Information provided,
+in accord with this section must be in a format that is publicly
+documented (and with an implementation available to the public in
+source code form), and must require no special password or key for
+unpacking, reading or copying.
+
+ 7. Additional Terms.
+
+ "Additional permissions" are terms that supplement the terms of this
+License by making exceptions from one or more of its conditions.
+Additional permissions that are applicable to the entire Program shall
+be treated as though they were included in this License, to the extent
+that they are valid under applicable law. If additional permissions
+apply only to part of the Program, that part may be used separately
+under those permissions, but the entire Program remains governed by
+this License without regard to the additional permissions.
+
+ When you convey a copy of a covered work, you may at your option
+remove any additional permissions from that copy, or from any part of
+it. (Additional permissions may be written to require their own
+removal in certain cases when you modify the work.) You may place
+additional permissions on material, added by you to a covered work,
+for which you have or can give appropriate copyright permission.
+
+ Notwithstanding any other provision of this License, for material you
+add to a covered work, you may (if authorized by the copyright holders of
+that material) supplement the terms of this License with terms:
+
+ a) Disclaiming warranty or limiting liability differently from the
+ terms of sections 15 and 16 of this License; or
+
+ b) Requiring preservation of specified reasonable legal notices or
+ author attributions in that material or in the Appropriate Legal
+ Notices displayed by works containing it; or
+
+ c) Prohibiting misrepresentation of the origin of that material, or
+ requiring that modified versions of such material be marked in
+ reasonable ways as different from the original version; or
+
+ d) Limiting the use for publicity purposes of names of licensors or
+ authors of the material; or
+
+ e) Declining to grant rights under trademark law for use of some
+ trade names, trademarks, or service marks; or
+
+ f) Requiring indemnification of licensors and authors of that
+ material by anyone who conveys the material (or modified versions of
+ it) with contractual assumptions of liability to the recipient, for
+ any liability that these contractual assumptions directly impose on
+ those licensors and authors.
+
+ All other non-permissive additional terms are considered "further
+restrictions" within the meaning of section 10. If the Program as you
+received it, or any part of it, contains a notice stating that it is
+governed by this License along with a term that is a further
+restriction, you may remove that term. If a license document contains
+a further restriction but permits relicensing or conveying under this
+License, you may add to a covered work material governed by the terms
+of that license document, provided that the further restriction does
+not survive such relicensing or conveying.
+
+ If you add terms to a covered work in accord with this section, you
+must place, in the relevant source files, a statement of the
+additional terms that apply to those files, or a notice indicating
+where to find the applicable terms.
+
+ Additional terms, permissive or non-permissive, may be stated in the
+form of a separately written license, or stated as exceptions;
+the above requirements apply either way.
+
+ 8. Termination.
+
+ You may not propagate or modify a covered work except as expressly
+provided under this License. Any attempt otherwise to propagate or
+modify it is void, and will automatically terminate your rights under
+this License (including any patent licenses granted under the third
+paragraph of section 11).
+
+ However, if you cease all violation of this License, then your
+license from a particular copyright holder is reinstated (a)
+provisionally, unless and until the copyright holder explicitly and
+finally terminates your license, and (b) permanently, if the copyright
+holder fails to notify you of the violation by some reasonable means
+prior to 60 days after the cessation.
+
+ Moreover, your license from a particular copyright holder is
+reinstated permanently if the copyright holder notifies you of the
+violation by some reasonable means, this is the first time you have
+received notice of violation of this License (for any work) from that
+copyright holder, and you cure the violation prior to 30 days after
+your receipt of the notice.
+
+ Termination of your rights under this section does not terminate the
+licenses of parties who have received copies or rights from you under
+this License. If your rights have been terminated and not permanently
+reinstated, you do not qualify to receive new licenses for the same
+material under section 10.
+
+ 9. Acceptance Not Required for Having Copies.
+
+ You are not required to accept this License in order to receive or
+run a copy of the Program. Ancillary propagation of a covered work
+occurring solely as a consequence of using peer-to-peer transmission
+to receive a copy likewise does not require acceptance. However,
+nothing other than this License grants you permission to propagate or
+modify any covered work. These actions infringe copyright if you do
+not accept this License. Therefore, by modifying or propagating a
+covered work, you indicate your acceptance of this License to do so.
+
+ 10. Automatic Licensing of Downstream Recipients.
+
+ Each time you convey a covered work, the recipient automatically
+receives a license from the original licensors, to run, modify and
+propagate that work, subject to this License. You are not responsible
+for enforcing compliance by third parties with this License.
+
+ An "entity transaction" is a transaction transferring control of an
+organization, or substantially all assets of one, or subdividing an
+organization, or merging organizations. If propagation of a covered
+work results from an entity transaction, each party to that
+transaction who receives a copy of the work also receives whatever
+licenses to the work the party's predecessor in interest had or could
+give under the previous paragraph, plus a right to possession of the
+Corresponding Source of the work from the predecessor in interest, if
+the predecessor has it or can get it with reasonable efforts.
+
+ You may not impose any further restrictions on the exercise of the
+rights granted or affirmed under this License. For example, you may
+not impose a license fee, royalty, or other charge for exercise of
+rights granted under this License, and you may not initiate litigation
+(including a cross-claim or counterclaim in a lawsuit) alleging that
+any patent claim is infringed by making, using, selling, offering for
+sale, or importing the Program or any portion of it.
+
+ 11. Patents.
+
+ A "contributor" is a copyright holder who authorizes use under this
+License of the Program or a work on which the Program is based. The
+work thus licensed is called the contributor's "contributor version".
+
+ A contributor's "essential patent claims" are all patent claims
+owned or controlled by the contributor, whether already acquired or
+hereafter acquired, that would be infringed by some manner, permitted
+by this License, of making, using, or selling its contributor version,
+but do not include claims that would be infringed only as a
+consequence of further modification of the contributor version. For
+purposes of this definition, "control" includes the right to grant
+patent sublicenses in a manner consistent with the requirements of
+this License.
+
+ Each contributor grants you a non-exclusive, worldwide, royalty-free
+patent license under the contributor's essential patent claims, to
+make, use, sell, offer for sale, import and otherwise run, modify and
+propagate the contents of its contributor version.
+
+ In the following three paragraphs, a "patent license" is any express
+agreement or commitment, however denominated, not to enforce a patent
+(such as an express permission to practice a patent or covenant not to
+sue for patent infringement). To "grant" such a patent license to a
+party means to make such an agreement or commitment not to enforce a
+patent against the party.
+
+ If you convey a covered work, knowingly relying on a patent license,
+and the Corresponding Source of the work is not available for anyone
+to copy, free of charge and under the terms of this License, through a
+publicly available network server or other readily accessible means,
+then you must either (1) cause the Corresponding Source to be so
+available, or (2) arrange to deprive yourself of the benefit of the
+patent license for this particular work, or (3) arrange, in a manner
+consistent with the requirements of this License, to extend the patent
+license to downstream recipients. "Knowingly relying" means you have
+actual knowledge that, but for the patent license, your conveying the
+covered work in a country, or your recipient's use of the covered work
+in a country, would infringe one or more identifiable patents in that
+country that you have reason to believe are valid.
+
+ If, pursuant to or in connection with a single transaction or
+arrangement, you convey, or propagate by procuring conveyance of, a
+covered work, and grant a patent license to some of the parties
+receiving the covered work authorizing them to use, propagate, modify
+or convey a specific copy of the covered work, then the patent license
+you grant is automatically extended to all recipients of the covered
+work and works based on it.
+
+ A patent license is "discriminatory" if it does not include within
+the scope of its coverage, prohibits the exercise of, or is
+conditioned on the non-exercise of one or more of the rights that are
+specifically granted under this License. You may not convey a covered
+work if you are a party to an arrangement with a third party that is
+in the business of distributing software, under which you make payment
+to the third party based on the extent of your activity of conveying
+the work, and under which the third party grants, to any of the
+parties who would receive the covered work from you, a discriminatory
+patent license (a) in connection with copies of the covered work
+conveyed by you (or copies made from those copies), or (b) primarily
+for and in connection with specific products or compilations that
+contain the covered work, unless you entered into that arrangement,
+or that patent license was granted, prior to 28 March 2007.
+
+ Nothing in this License shall be construed as excluding or limiting
+any implied license or other defenses to infringement that may
+otherwise be available to you under applicable patent law.
+
+ 12. No Surrender of Others' Freedom.
+
+ If conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot convey a
+covered work so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you may
+not convey it at all. For example, if you agree to terms that obligate you
+to collect a royalty for further conveying from those to whom you convey
+the Program, the only way you could satisfy both those terms and this
+License would be to refrain entirely from conveying the Program.
+
+ 13. Remote Network Interaction; Use with the GNU General Public License.
+
+ Notwithstanding any other provision of this License, if you modify the
+Program, your modified version must prominently offer all users
+interacting with it remotely through a computer network (if your version
+supports such interaction) an opportunity to receive the Corresponding
+Source of your version by providing access to the Corresponding Source
+from a network server at no charge, through some standard or customary
+means of facilitating copying of software. This Corresponding Source
+shall include the Corresponding Source for any work covered by version 3
+of the GNU General Public License that is incorporated pursuant to the
+following paragraph.
+
+ Notwithstanding any other provision of this License, you have
+permission to link or combine any covered work with a work licensed
+under version 3 of the GNU General Public License into a single
+combined work, and to convey the resulting work. The terms of this
+License will continue to apply to the part which is the covered work,
+but the work with which it is combined will remain governed by version
+3 of the GNU General Public License.
+
+ 14. Revised Versions of this License.
+
+ The Free Software Foundation may publish revised and/or new versions of
+the GNU Affero General Public License from time to time. Such new versions
+will be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+ Each version is given a distinguishing version number. If the
+Program specifies that a certain numbered version of the GNU Affero General
+Public License "or any later version" applies to it, you have the
+option of following the terms and conditions either of that numbered
+version or of any later version published by the Free Software
+Foundation. If the Program does not specify a version number of the
+GNU Affero General Public License, you may choose any version ever published
+by the Free Software Foundation.
+
+ If the Program specifies that a proxy can decide which future
+versions of the GNU Affero General Public License can be used, that proxy's
+public statement of acceptance of a version permanently authorizes you
+to choose that version for the Program.
+
+ Later license versions may give you additional or different
+permissions. However, no additional obligations are imposed on any
+author or copyright holder as a result of your choosing to follow a
+later version.
+
+ 15. Disclaimer of Warranty.
+
+ THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
+APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
+HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
+OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
+THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
+IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
+ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+ 16. Limitation of Liability.
+
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
+THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
+GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
+USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
+DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
+PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
+EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
+SUCH DAMAGES.
+
+ 17. Interpretation of Sections 15 and 16.
+
+ If the disclaimer of warranty and limitation of liability provided
+above cannot be given local legal effect according to their terms,
+reviewing courts shall apply local law that most closely approximates
+an absolute waiver of all civil liability in connection with the
+Program, unless a warranty or assumption of liability accompanies a
+copy of the Program in return for a fee.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+state the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+
+ Copyright (C)
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU Affero General Public License as published
+ by the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU Affero General Public License for more details.
+
+ You should have received a copy of the GNU Affero General Public License
+ along with this program. If not, see .
+
+Also add information on how to contact you by electronic and paper mail.
+
+ If your software can interact with users remotely through a computer
+network, you should also make sure that it provides a way for users to
+get its source. For example, if your program is a web application, its
+interface could display a "Source" link that leads users to an archive
+of the code. There are many ways you could offer source, and different
+solutions will be better for different programs; see section 13 for the
+specific requirements.
+
+ You should also get your employer (if you work as a programmer) or school,
+if any, to sign a "copyright disclaimer" for the program, if necessary.
+For more information on this, and how to apply and follow the GNU AGPL, see
+.
+
diff --git a/cgds-matlab/README b/cgds-matlab/README
deleted file mode 100755
index 810f27f6702..00000000000
--- a/cgds-matlab/README
+++ /dev/null
@@ -1,27 +0,0 @@
-cBio Cancer Genomics Data Server (CGDS) Matlab Toolbox v1.06 (September 19 2013)
-Implemented by Erik Larsson (larsson@cbio.mskcc.org)
-
-Unzip in a directory of your choice. If you want to make the functions
-available from any directory, add that directory to the MATLAB path, e.g.:
-
-addpath('/Users/yourusername/Documents/MATLAB/cgds')
-
-Type 'helpwin cgds' to get an overview of available functions.
-
-'showdemo cgdstutorial' will get you started quickly.
-
-
-Change log:
-1.06 19/09/13 Adapted to changes in web API: extended clinical data
- format (note that previous version no longer works
- correctly). Removed requirement for trailing '/' in URL
- string. Updated URL in help pages. Updated tutorial to
- comply with new output format for clinical data.
-1.05 27/06/12 Bugfixes to cgdstutorial.m (new CGDS URL, survival plot
- fix)
-1.04 22/12/11 Updated cgdstutorial.m to comply with changed naming
- conventions for case set IDs and genetic profile IDs
-1.03 9/9/11 Minor update to cgdstutorial.m
-1.02 9/5/11 Adapted to changes in web API: getcancerstudies() replaces
- getcancertypes(), new default URL
-1.01 4/28/11 First released version
diff --git a/cgds-matlab/cgdstutorial.m b/cgds-matlab/cgdstutorial.m
deleted file mode 100755
index 3fa356d4e46..00000000000
--- a/cgds-matlab/cgdstutorial.m
+++ /dev/null
@@ -1,68 +0,0 @@
-%% CGDS toolbox examples ('showdemo cgdstutorial')
-% The CGDS toolbox provides a set of functions for retrieving data from the
-% cBio Cancer Genomics Data Portal web API. Get started by adding the CGDS
-% toolbox directory to the path and setting the server URL.
-
-% Modify path to make toolbox functions globally available in matlab.
-% This will depend on install location, and is only necessary if you want
-% to make the functions available from any directory
-addpath('/Users/Erik/Documents/MATLAB/cgds');
-
-% Set web API URL (excluding 'webservice.do', trailing slash optional)
-cgdsURL = 'http://www.cbioportal.org/public-portal/';
-
-%% Show toolbox help
-% Use 'helpwin cgds' if you prefer to display it in the Help window.
-help cgds;
-
-%% Get list of available cancer types
-cancerStudies = getcancerstudies(cgdsURL)
-
-%% Get available genetic profiles for a given cancer type
-% This example retrieves available profiles for glioblastoma (GBM).
-geneticProfiles = getgeneticprofiles(cgdsURL, 'gbm_tcga')
-
-%% Get available case lists (collections of samples) for a given cancer type
-caseLists = getcaselists(cgdsURL, 'gbm_tcga')
-
-%% Get multiple types of genetic profile data for a specific gene
-% This fetches both mRNA expression and copy number status for TP53 in GBM.
-% The last argument causes data to be returned as a numeric matrix. Set to
-% false when fetching non-numeric data, e.g. mutations. 'gbm_tcga_mrna' and
-% 'gbm_tcga_gistic' are genetic profile IDs in
-% geneticProfiles.geneticProfileId. 'gbm_tcga_all' is a case list ID from
-% caseLists.caseListId.
-profileData = getprofiledata(cgdsURL, 'gbm_tcga_all', ...
- {'gbm_tcga_mrna' 'gbm_tcga_gistic'}, ...
- 'TP53', true)
-
-%% Plot mRNA levels as a function of copy number status
-boxplot(profileData.data(1,:),profileData.data(2,:));
-title('TP53'); xlabel('CNA'); ylabel('mRNA level');
-
-%% Get genetic profile data for multiple specified genes
-% This fetches mutation data for five different genes. Only one genetic
-% profile ID is allowed in this case. Note that genes may be returned in a
-% different order than requested.
-profileData = getprofiledata(cgdsURL, 'gbm_tcga_sequenced', ...
- 'gbm_tcga_mutations', ...
- {'TP53' 'NF1' 'EGFR' 'PTEN' 'IDH1'}, false)
-
-%% Get clinical data for all patients in a given case list
-clinicalData = getclinicaldata(cgdsURL, 'gbm_tcga_sequenced')
-
-%% Survival plots for patients with and without IDH1 mutations
-% Simplified plot that disregards censoring.
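-% Look up the IDH1 row by symbol, since gene order is not guaranteed.
-idh1Row = strcmp(profileData.common, 'IDH1');
-isMutated = ismember(clinicalData.caseId, ...
- profileData.caseId(~strcmp(profileData.data(idh1Row,:), 'NaN')));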
-overallSurvivalStatus = clinicalData.data(:, ...
- strcmp(clinicalData.clinVariable, 'OS_STATUS'));
-overallSurvivalMonths = str2double(clinicalData.data(:, ...
- strcmp(clinicalData.clinVariable, 'OS_MONTHS')));
-ecdf(overallSurvivalMonths(isMutated), 'function','survivor');
-set(get(gca,'Children'), 'Color', [1 0 0]); hold on;
-ecdf(overallSurvivalMonths(~isMutated), 'function','survivor');
-xlabel('Overall survival (months)'); ylabel('Proportion surviving');
-legend({'IDH1 mutated' 'IDH1 wild type'});
-
-%% Run a function in non-verbose mode
-cancerStudies = getcancerstudies(cgdsURL, 'silent');
diff --git a/cgds-matlab/getcancerstudies.m b/cgds-matlab/getcancerstudies.m
deleted file mode 100644
index afa753e9512..00000000000
--- a/cgds-matlab/getcancerstudies.m
+++ /dev/null
@@ -1,28 +0,0 @@
-function cancerStudies = getcancerstudies(cgdsURL, varargin)
-%GETCANCERSTUDIES Get cancer studies from the cBio CGDS portal.
-% A = GETCANCERSTUDIES(cgdsURL) loads a list of available cancer types
-% into A. cgdsURL points to the CGDS web API, typically
-% http://www.cbioportal.org/public-portal/.
-%
-% The function returns a struct array with the following fields:
-% cancerTypeId, name, description.
-%
-% Field names follow column names as returned by the web API.
-%
-% A = GETCANCERSTUDIES(cgdsURL, 'silent')
-% runs the function in non-verbose mode, suppressing status and warning
-% messages from the cBio CGDS web API. Any string or numerical value
-% (e.g. 'non-verbose' or 0) will have this effect. Error messages are
-% always printed, as these indicate an unrecoverable problem.
-%
-% See also getgeneticprofiles, getcaselists, getprofiledata,
-% getclinicaldata.
-
-verbose = isempty(varargin);
-if ~strcmp(cgdsURL(end), '/') cgdsURL(end + 1) = '/'; end
-
-cells = urlgetcells([cgdsURL 'webservice.do?cmd=getCancerStudies'], verbose);
-
-cancerStudies.cancerTypeId = cells(2:end, 1);
-cancerStudies.name = cells(2:end, 2);
-cancerStudies.description = cells(2:end, 3);
diff --git a/cgds-matlab/getcaselists.m b/cgds-matlab/getcaselists.m
deleted file mode 100755
index 97c5348f9da..00000000000
--- a/cgds-matlab/getcaselists.m
+++ /dev/null
@@ -1,39 +0,0 @@
-function caseLists = getcaselists(cgdsURL, cancerTypeId, varargin)
-%GETCASELISTS Get case lists from the cBio CGDS portal.
-% A = GETCASELISTS(cgdsURL, cancerTypeId) loads available case lists for
-% a specific cancer type into A. cgdsURL points to the CGDS web API,
-% typically http://www.cbioportal.org/public-portal/. cancerTypeId is
-% the cancer type ID, as returned by the getcancerstudies function.
-%
-% Variable names follow column names returned by the web API.
-%
-% The function returns a struct array with the following fields:
-% caseListId, caseListName, caseListDescription, cancerTypeId, caseIds.
-% Each element of caseIds contains a cell array of strings.
-%
-% Field names follow column names as returned by the web API.
-%
-% A = GETCASELISTS(cgdsURL, cancerTypeId, 'silent')
-% runs the function in non-verbose mode, suppressing status and warning
-% messages from the cBio CGDS web API. Any string or numerical value
-% (e.g. 'non-verbose' or 0) will have this effect. Error messages are
-% always printed, as these indicate an unrecoverable problem.
-%
-% See also getcancerstudies, getgeneticprofiles, getprofiledata,
-% getclinicaldata.
-
-verbose = isempty(varargin);
-if ~strcmp(cgdsURL(end), '/') cgdsURL(end + 1) = '/'; end
-
-cells = urlgetcells([cgdsURL 'webservice.do?cmd=getCaseLists&cancer_type_id=' cancerTypeId], verbose);
-
-caseLists.caseListId = cells(2:end, 1);
-caseLists.caseListName = cells(2:end, 2);
-caseLists.caseListDescription = cells(2:end, 3);
-caseLists.cancerTypeId = cells(2:end, 4);
-
-% tokenize each case id list
-for i = 2:size(cells, 1),
- thisCaseIds = textscan(cells{i, 5}, '%s', 'delimiter', ' ');
- caseLists.caseIds{i - 1, 1} = thisCaseIds{1};
-end
diff --git a/cgds-matlab/getclinicaldata.m b/cgds-matlab/getclinicaldata.m
deleted file mode 100755
index 6df3b912443..00000000000
--- a/cgds-matlab/getclinicaldata.m
+++ /dev/null
@@ -1,32 +0,0 @@
-function clinicalData = getclinicaldata(cgdsURL, caseListId, varargin)
-%GETCLINICALDATA Get clinical data from the cBio CGDS portal.
-% A = getclinicaldata(cgdsURL, caseListId) loads clinical data into A.
-% cgdsURL points to the CGDS web API, typically
-% http://www.cbioportal.org/public-portal/. caseListId is a case list
-% ID, as returned by the getcaselists function.
-%
-% Returns a struct array with the following fields: data (data matrix),
-% caseId (row labels for data matrix), clinVariable (column labels for
-% data matrix).
-%
-% Since data returned by this function can be of mixed types, everything
-% is given as strings. Use str2double() to convert to numeric format
-% when appropriate.
-%
-% A = getclinicaldata(cgdsURL, caseListId, 'silent')
-% runs the function in non-verbose mode, suppressing status and warning
-% messages from the cBio CGDS web API. Any string or numerical value
-% (e.g. 'non-verbose' or 0) will have this effect. Error messages are
-% always printed, as these indicate an unrecoverable problem.
-%
-% See also getcancerstudies, getgeneticprofiles, getcaselists,
-% getprofiledata.
-
-verbose = isempty(varargin);
-if ~strcmp(cgdsURL(end), '/') cgdsURL(end + 1) = '/'; end
-
-cells = urlgetcells([cgdsURL 'webservice.do?cmd=getClinicalData&case_set_id=' caseListId], verbose);
-
-clinicalData.caseId = cells(2:end, 1);
-clinicalData.clinVariable = cells(1, 2:end)';
-clinicalData.data = cells(2:end, 2:end);
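
Review note: the last three assignments in getclinicaldata.m slice one parsed table into row labels, column labels, and a data matrix. The following is a minimal Python sketch of that slicing, not part of the toolbox. The function name `split_clinical_table` is hypothetical, and it assumes the response has already been parsed into a list of rows with the header row first.

```python
def split_clinical_table(rows):
    """Split parsed rows into case IDs, variable names, and the data matrix.

    rows[0] is the header row; the first column of every row holds the case ID.
    All values stay as strings, mirroring the MATLAB function.
    """
    header, body = rows[0], rows[1:]
    return {
        "caseId": [row[0] for row in body],   # row labels
        "clinVariable": header[1:],           # column labels
        "data": [row[1:] for row in body],    # string matrix
    }
```

As in the MATLAB version, numeric columns such as OS_MONTHS would still need an explicit conversion (for example with `float`) before use.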
diff --git a/cgds-matlab/getgeneticprofiles.m b/cgds-matlab/getgeneticprofiles.m
deleted file mode 100755
index d6bcfe7d76a..00000000000
--- a/cgds-matlab/getgeneticprofiles.m
+++ /dev/null
@@ -1,32 +0,0 @@
-function geneticProfiles = getgeneticprofiles(cgdsURL, cancerTypeId, varargin)
-%GETGENETICPROFILES Get genetic profiles from the cBio CGDS portal.
-% A = GETGENETICPROFILES(cgdsURL, cancerTypeId) loads a list of
-% available genetic profiles into A. cgdsURL points to the CGDS web API,
-% typically http://www.cbioportal.org/public-portal/. cancerTypeId is
-% the cancer type ID, as returned by the getcancerstudies function.
-%
-% The function returns a struct array with the following fields:
-% geneticProfileId, geneticProfileName, geneticProfileDescription,
-% cancerTypeId, geneticAlterationType.
-%
-% Field names follow column names returned by the web API.
-%
-% A = GETGENETICPROFILES(cgdsURL, cancerTypeId, 'silent')
-% runs the function in non-verbose mode, suppressing status and warning
-% messages from the cBio CGDS web API. Any string or numerical value
-% (e.g. 'non-verbose' or 0) will have this effect. Error messages are
-% always printed, as these indicate an unrecoverable problem.
-%
-% See also getcancerstudies, getcaselists, getprofiledata,
-% getclinicaldata.
-
-verbose = isempty(varargin);
-if ~strcmp(cgdsURL(end), '/') cgdsURL(end + 1) = '/'; end
-
-cells = urlgetcells([cgdsURL 'webservice.do?cmd=getGeneticProfiles&cancer_type_id=' cancerTypeId], verbose);
-
-geneticProfiles.geneticProfileId = cells(2:end, 1);
-geneticProfiles.geneticProfileName = cells(2:end, 2);
-geneticProfiles.geneticProfileDescription = cells(2:end, 3);
-geneticProfiles.cancerTypeId = cells(2:end, 4);
-geneticProfiles.geneticAlterationType = cells(2:end, 5);
diff --git a/cgds-matlab/getprofiledata.m b/cgds-matlab/getprofiledata.m
deleted file mode 100755
index df8e0e23739..00000000000
--- a/cgds-matlab/getprofiledata.m
+++ /dev/null
@@ -1,81 +0,0 @@
-function profileData = getprofiledata(cgdsURL, caseListId, geneticProfileId, geneList, toNumeric, varargin)
-%GETPROFILEDATA Get genomic profile data from the cBio CGDS portal.
-% A = GETPROFILEDATA(cgdsURL, caseListId, geneticProfileId, geneList, toNumeric)
-% loads genomic profile data into A. cgdsURL points to the CGDS web API,
-% typically http://www.cbioportal.org/public-portal/. caseListId is a
-% case list ID, as returned by the getcaselists function.
-% geneticProfileId is a cell array of genetic profile IDs, as returned
-% by getgeneticprofiles. geneList is a cell array of HUGO gene symbols
-% or Entrez Gene IDs. If toNumeric is true, data will be returned as a
-% numeric matrix (convenient e.g. for mRNA expression data).
-%
-% This function can be called in two different ways:
-%
-% * Specify multiple genes (cell array of strings, or single string
-% with symbols separated by , or +) and a single genetic profile ID.
-% Returns a struct array with the following fields: geneId (Entrez Gene
-% IDs), common (HUGO gene symbols), data (data matrix), caseId (column
-% labels for the data matrix).
-%
-% * Specify a single gene and multiple genetic profile IDs (cell
-% array of strings, or separated by , or +). Returns a struct array with
-% the following fields: geneticProfileId, alterationType, geneId (Entrez
-% Gene ID), common (HUGO gene symbol), data (data matrix).
-%
-% Field names follow column names as returned by the web API.
-%
-% A = GETPROFILEDATA(cgdsURL, caseListId, geneticProfileId, geneList, toNumeric, 'silent')
-% runs the function in non-verbose mode, suppressing status and warning
-% messages from the cBio CGDS web API. Any string or numerical value
-% (e.g. 'non-verbose' or 0) will have this effect. Error messages are
-% always printed, as these indicate an unrecoverable problem.
-%
-% See also getcancerstudies, getgeneticprofiles, getcaselists,
-% getclinicaldata.
-
-verbose = isempty(varargin);
-if ~strcmp(cgdsURL(end), '/') cgdsURL(end + 1) = '/'; end
-
-cells = urlgetcells([cgdsURL 'webservice.do?cmd=getProfileData&case_set_id=' caseListId ...
- '&genetic_profile_id=' cellarraytostr(geneticProfileId) ...
- '&gene_list=' cellarraytostr(geneList)], verbose);
-
-% determine format
-if strcmp(cells(1,1), 'GENE_ID')
- % multiple genes, single genetic profile ID
- profileData.geneId = cells(2:end, 1);
- profileData.common = cells(2:end, 2);
- profileData.caseId = cells(1, 3:end)';
- if toNumeric
- profileData.data = str2double(cells(2:end, 3:end));
- else
- profileData.data = cells(2:end, 3:end);
- end
-else
- % single gene, multiple genetic profile IDs
- profileData.geneticProfileId = cells(2:end, 1);
- profileData.alterationType = cells(2:end, 2);
- profileData.geneId = cells(2:end, 3);
- profileData.common = cells(2:end, 4);
- profileData.caseId = cells(1, 5:end)';
- if toNumeric
- profileData.data = str2double(cells(2:end, 5:end));
- else
- profileData.data = cells(2:end, 5:end);
- end
-end
-
-
-function s = cellarraytostr(sArray)
-% converts a cell array of strings into a single string where elements
-% are separated by '+'. in case sArray is a string rather than cell array,
-% it is simply passed through to s.
-
-if ischar(sArray)
- s = sArray;
-else
- s = sArray{1};
- for i = 2:length(sArray),
- s = [s '+' sArray{i}];
- end
-end
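
Review note: getprofiledata.m builds its request by joining profile IDs and gene symbols with '+' (the cellarraytostr helper) and appending them to the webservice.do query. The following is a minimal Python sketch of that assembly, not part of the toolbox. The names `join_ids` and `profile_data_url` are hypothetical. The query parameter names (`case_set_id`, `genetic_profile_id`, `gene_list`) are copied from the MATLAB code, and no URL escaping is applied because the original does none.

```python
def join_ids(ids):
    """Join a list of IDs with '+'; a plain string is passed through unchanged."""
    if isinstance(ids, str):
        return ids
    return "+".join(ids)


def profile_data_url(base_url, case_list_id, genetic_profile_ids, genes):
    """Build a getProfileData request URL in the same order as the MATLAB code."""
    # The trailing slash is optional in the base URL, so add it when missing.
    if not base_url.endswith("/"):
        base_url += "/"
    return (
        base_url
        + "webservice.do?cmd=getProfileData&case_set_id=" + case_list_id
        + "&genetic_profile_id=" + join_ids(genetic_profile_ids)
        + "&gene_list=" + join_ids(genes)
    )
```

For example, `profile_data_url(cgds_url, 'gbm_tcga_all', ['gbm_tcga_mrna', 'gbm_tcga_gistic'], 'TP53')` corresponds to the single-gene, multiple-profile call in the tutorial, where `cgds_url` stands for whichever server URL is in use.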
diff --git a/cgds-matlab/html/cgdstutorial.html b/cgds-matlab/html/cgdstutorial.html
deleted file mode 100644
index 00cc1865488..00000000000
--- a/cgds-matlab/html/cgdstutorial.html
+++ /dev/null
@@ -1,231 +0,0 @@
-
-
-
-
- cgdstutorial
-The CGDS toolbox provides a set of functions for retrieving data from the cBio Cancer Genomics Data Portal web API. Get started by adding the CGDS toolbox directory to the path and setting the server URL.
-% Modify path to make toolbox functions globally available in matlab.
-% This will depend on install location, and is only necessary if you want
-% to make the functions available from any directory
-addpath('/Users/Erik/Documents/MATLAB/cgds');
-
-% Set web API URL (excluding 'webservice.do', trailing slash optional)
-cgdsURL = 'http://www.cbioportal.org/public-portal/';
-
-Show toolbox help
-Use 'helpwin cgds' if you prefer to display it in the Help window.
-help cgds;
-
-Contents of cgds:
-
-cgdstutorial - CGDS toolbox examples ('showdemo cgdstutorial')
-getcancerstudies - Get cancer studies from the cBio CGDS portal.
-getcaselists - Get case lists from the cBio CGDS portal.
-getclinicaldata - Get clinical data from the cBio CGDS portal.
-getgeneticprofiles - Get genetic profiles from the cBio CGDS portal.
-getprofiledata - Get genomic profile data from the cBio CGDS portal.
-
-
-Get multiple types of genetic profile data for a specific gene
-This fetches both mRNA expression and copy number status for TP53 in GBM. The last argument causes data to be returned as a numeric matrix. Set to false when fetching non-numeric data, e.g. mutations. 'gbm_tcga_mrna' and 'gbm_tcga_gistic' are genetic profile IDs in geneticProfiles.geneticProfileId. 'gbm_tcga_all' is a case list ID from caseLists.caseListId.
-Get genetic profile data for multiple specified genes
-This fetches mutation data for five different genes. Only one genetic profile ID is allowed in this case. Note that genes may be returned in a different order than requested.
\ No newline at end of file
diff --git a/cgds-matlab/html/cgdstutorial.png b/cgds-matlab/html/cgdstutorial.png
deleted file mode 100644
index 7ddbbeab276..00000000000
Binary files a/cgds-matlab/html/cgdstutorial.png and /dev/null differ
diff --git a/cgds-matlab/html/cgdstutorial_01.png b/cgds-matlab/html/cgdstutorial_01.png
deleted file mode 100644
index 2c84e4f090d..00000000000
Binary files a/cgds-matlab/html/cgdstutorial_01.png and /dev/null differ
diff --git a/cgds-matlab/html/cgdstutorial_02.png b/cgds-matlab/html/cgdstutorial_02.png
deleted file mode 100644
index bef28738fad..00000000000
Binary files a/cgds-matlab/html/cgdstutorial_02.png and /dev/null differ
diff --git a/cgds-matlab/private/urlgetcells.m b/cgds-matlab/private/urlgetcells.m
deleted file mode 100755
index 8276dfce755..00000000000
--- a/cgds-matlab/private/urlgetcells.m
+++ /dev/null
@@ -1,33 +0,0 @@
-function cells = urlgetcells(url, verbose)
-% this function is only used internally by the CGDS matlab toolbox
-% returns a 2D cell array where each cell contains a tab-delimited 'cell'
-% from the server output
-
-% decrease if out of memory errors are encountered
-nPrealloc = 10000;
-
-S = urlread(url);
-
-rows = textscan(S, '%s', 'delimiter', '\n', 'BufSize', 65535);
-rows = rows{1};
-
-cells = cell(nPrealloc, 1);
-n = 0;
-for i = 1:length(rows)
- thisRow = rows{i};
- if strncmp(thisRow, '#', 1)
- if verbose
- fprintf('%s\n', thisRow);
- end
- elseif strncmp(thisRow, 'Error:', 6)
- fprintf('%s\n', thisRow);
- error('Cgds:getcancertypes:CgdsError','CGDS returned an error.');
- else
- % this row contains data/header rather than status/warnings/errors
- n = n + 1;
- thisCells = textscan(thisRow, '%s', 'delimiter', '\t', 'BufSize', 65535);
- thisCells = thisCells{1};
- cells(n, 1:length(thisCells)) = thisCells;
- end
-end
-cells = cells(1:n, :);
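
Review note: urlgetcells.m classifies each response line as a status line (leading '#'), an error line (leading 'Error:'), or a tab-delimited data row. The following is a minimal Python sketch of that classification, not part of the toolbox. The names `CgdsError` and `parse_cells` are hypothetical. It takes the response text as an argument instead of fetching it, skips blank lines (an assumption about the intended behavior), and returns ragged lists where the MATLAB code returns a padded cell array.

```python
class CgdsError(Exception):
    """Raised when the server response contains an 'Error:' line."""


def parse_cells(text, verbose=False):
    """Return the data rows of a response as lists of tab-separated fields."""
    rows = []
    for line in text.split("\n"):
        line = line.rstrip("\r")
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith("#"):
            # status or warning line: shown only in verbose mode
            if verbose:
                print(line)
        elif line.startswith("Error:"):
            # errors are always surfaced, as in the MATLAB code
            raise CgdsError(line)
        else:
            rows.append(line.split("\t"))
    return rows
```

Prefix tests such as `startswith` are safe on rows shorter than the prefix, which is the same reason a length-limited comparison is preferable to indexing `thisRow(1:6)` directly.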
diff --git a/cgds-r/README b/cgds-r/README
deleted file mode 100644
index 5f42b139998..00000000000
--- a/cgds-r/README
+++ /dev/null
@@ -1,7 +0,0 @@
-# Building
-> R CMD check cgdsr
-> R CMD build cgdsr
-
-# Installing
-
-> R CMD INSTALL cgdsr_1.1.30.tar.gz
diff --git a/cgds-r/cgdsr/DESCRIPTION b/cgds-r/cgdsr/DESCRIPTION
deleted file mode 100644
index 307318e0c5f..00000000000
--- a/cgds-r/cgdsr/DESCRIPTION
+++ /dev/null
@@ -1,14 +0,0 @@
-Package: cgdsr
-Type: Package
-Title: R-Based API for accessing the MSKCC Cancer Genomics Data Server
- (CGDS).
-Version: 1.1.30
-Date: 2013-09-16
-Author: Anders Jacobsen
-Maintainer: Anders Jacobsen
-Description: The package provides a basic set of R functions for querying the Cancer Genomics Data Server (CGDS), hosted by the Computational Biology Center at Memorial-Sloan-Kettering Cancer Center (MSKCC).
-License: GPL
-LazyLoad: yes
-URL: http://www.cbioportal.org/
-Depends: R (>= 2.12.0)
-Imports: R.oo, R.methodsS3
diff --git a/cgds-r/cgdsr/NAMESPACE b/cgds-r/cgdsr/NAMESPACE
deleted file mode 100644
index 0c8e1b13b7b..00000000000
--- a/cgds-r/cgdsr/NAMESPACE
+++ /dev/null
@@ -1,15 +0,0 @@
-# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-# IMPORTS
-# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-# Packages to be imported
-import("R.methodsS3","R.oo")
-
-# Objects that must be exported explicitly
-
-
-# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-# EXPORTS
-# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-# Export all public methods, that is, those without a preceding dot
-# in their names.
-exportPattern("^[^\\.]")
diff --git a/cgds-r/cgdsr/R/cgdsr.R b/cgds-r/cgdsr/R/cgdsr.R
deleted file mode 100644
index 7a0ab83664e..00000000000
--- a/cgds-r/cgdsr/R/cgdsr.R
+++ /dev/null
@@ -1,437 +0,0 @@
-library(R.oo);
-
-setConstructorS3("CGDS", function(url='',verbose=FALSE,ploterrormsg='') {
- extend(Object(), "CGDS",
- .url=url,
- .verbose=verbose,
- .ploterrormsg=ploterrormsg)
-})
-
-setMethodS3("processURL","CGDS", private=TRUE, function(x, url, ...) {
- if (x$.verbose) cat(url,"\n")
- df = read.table(url, skip=0, header=TRUE, as.is=TRUE, sep="\t",quote='')
-})
-
-setMethodS3("setPlotErrorMsg","CGDS", function(x, msg, ...) {
- x$.ploterrormsg = msg
- return(msg)
-})
-
-setMethodS3("setVerbose","CGDS", function(x, verbose, ...) {
- x$.verbose = verbose
- return(verbose)
-})
-
-setMethodS3("getCancerStudies","CGDS", function(x, ...) {
- url = paste(x$.url, "webservice.do?cmd=getCancerStudies&",sep="")
- df = processURL(x,url)
- return(df)
-})
-
-setMethodS3("getCaseLists","CGDS", function(x, cancerStudy, ...) {
- url = paste(x$.url, "webservice.do?cmd=getCaseLists&cancer_study_id=", cancerStudy, sep="")
- df = processURL(x,url)
- return(df)
-})
-
-setMethodS3("getGeneticProfiles","CGDS", function(x, cancerStudy, ...) {
- url = paste(x$.url, "webservice.do?cmd=getGeneticProfiles&cancer_study_id=", cancerStudy, sep="")
- df = processURL(x,url)
- return(df)
-})
-
-setMethodS3("getMutationData","CGDS", function(x, caseList, geneticProfile, genes, ...) {
- url = paste(x$.url, "webservice.do?cmd=getMutationData",
- "&case_set_id=", caseList,
- "&genetic_profile_id=", geneticProfile,
- "&gene_list=", paste(genes,collapse=","), sep="")
- df = processURL(x,url)
- return(df)
-})
-
-setMethodS3("getProfileData","CGDS", function(x, genes, geneticProfiles, caseList='', cases=c(), caseIdsKey = '', ...) {
- url = paste(x$.url, "webservice.do?cmd=getProfileData",
- "&gene_list=", paste(genes,collapse=","),
- "&genetic_profile_id=", paste(geneticProfiles,collapse=","),
- "&id_type=", 'gene_symbol',
- sep="")
-
- if (length(cases)>0) { url = paste(url,"&case_list=", paste(cases,collapse=","),sep='')
- } else if (caseIdsKey != '') { url = paste(url,"&case_ids_key=", caseIdsKey,sep='')
- } else { url = paste(url,"&case_set_id=", caseList,sep='') }
-
- df = processURL(x,url)
-
- if (nrow(df) == 0) { return(df) }
-
- m = matrix()
- # process data before returning
- if (length(geneticProfiles) > 1) {
- cnames = df[,1]
- m = t(df[,-c(1:4)])
- colnames(m) = cnames
- } else {
- cnames = df[,2]
- m = t(df[,-c(1:2)])
- colnames(m) = cnames
- }
-
- return(data.frame(m))
-})
-
-setMethodS3("getClinicalData","CGDS", function(x, caseList='', cases=c(), caseIdsKey = '', ...) {
- url = paste(x$.url, "webservice.do?cmd=getClinicalData",sep="")
-
- if (length(cases)>0) { url = paste(url,"&case_list=", paste(cases,collapse=","),sep='')
- } else if (caseIdsKey != '') { url = paste(url,"&case_ids_key=", caseIdsKey,sep='')
- } else { url = paste(url,"&case_set_id=", caseList,sep='') }
-
- df = processURL(x,url)
- rownames(df) = make.names(df[,1])
- return(df[,-1])
-})
-
-setMethodS3("plot","CGDS", function(x, cancerStudy, genes, geneticProfiles, caseList='', cases=c(), caseIdsKey = '', skin='cont', skin.normals='', skin.col.gp = c(), add.corr = '', legend.pos = 'topright', ...) {
-
- errormsg <- function(msg,error=TRUE) {
- # return empty plot with text
- if (error) {msg = paste('Error:',msg)}
- # override msg if global message provided in object
- if (x$.ploterrormsg != '') {msg = x$.ploterrormsg}
- plot.new()
- # set message text here ...
- #mtext(msg,cex=1.0,col='darkred')
- text(0.5,0.5,msg,cex=1.0,col='darkred')
- box()
- return(msg)
- }
-
- # we only allow the following combinations
- # a) gene1 (1 profile) # b) gene1 vs gene2 (1 profile)
- # c) profile 1 vs profile 2 (1 gene)
- if((length(genes) > 1 & length(geneticProfiles) > 1) | (length(genes) > 2 | length(geneticProfiles) > 2)) {
- return(errormsg("use only 2 genetic profiles OR 2 genes"))
- }
-
- # make genenames conform to R variable names
- genesR = make.names(genes)
-
- # get data, check more than zero rows returned, otherwise return
- df = getProfileData(x, genes, geneticProfiles, caseList, cases, caseIdsKey)
- if (nrow(df) == 0) { return(errormsg(paste('empty data frame returned :\n',colnames(df)[1]))) }
-
- # check that data was returned for both genes or both genetic profiles
- if (length(genes) == 2 & ncol(df) != 2) { return(errormsg(paste("gene not found:", setdiff(genesR,colnames(df))))) }
- if (length(geneticProfiles) == 2 & ncol(df) != 2) { return(errormsg(paste("geneticProfile ID not found:", setdiff(geneticProfiles,colnames(df))))) }
-
- # get geneticProfiles annotation for axis labels
- gps = getGeneticProfiles(x, cancerStudy)
- if (nrow(gps) == 0) { return(errormsg(colnames(gps)[1])) }
- rownames(gps) = gps[,1]
- gpaxisnames = gps[geneticProfiles,'genetic_profile_name']
- names(gpaxisnames) = geneticProfiles
-
- # we can have a situation where there is no data for a given combination of gene and genetic profile
- # in this case, we have a column of NaN, and we generate an error
- nacols = sapply(df, function(x) all(is.nan(x)))
- if (any(nacols)) {
- if (length(geneticProfiles) > 1) {
- # two genetic profiles in columns, one gene
- return(errormsg(paste(genes, "has no data\n for genetic profile(s):", paste(gpaxisnames[nacols],collapse=", ")),FALSE))
- } else {
- # one genetic profile, one or two genes in columns
- return(errormsg(paste(paste(genes[nacols],collapse=' and '), "has no data\n for genetic profile:", gpaxisnames),FALSE))
- }
- }
-
- # set sub title with correlation if specified
- plot.subtitle = ''
- if ((add.corr == 'pearson' | add.corr == 'spearman') & ncol(df) == 2) {
- ct = cor.test(df[,1],df[,2],method=add.corr)
- plot.subtitle = paste(add.corr, ' r = ', sprintf("%.2f",ct$estimate), ', p = ',sprintf("%.1e",ct$p.value))
- }
-
- ###
- ### Skins
- ###
-
- if (skin == 'cont') {
-
- if(length(genes) == 1 & length(geneticProfiles) == 1) {
- hist(df[,1],xlab=paste(genes," , ",gpaxisnames,sep=""),main='')
- } else if (length(genes) == 2) {
- # two genes
- plot(df[,genesR[1]],df[,genesR[2]] , main = '', xlab = paste(genes[1],", ",gpaxisnames,sep=""), ylab = paste(genes[2],", ",gpaxisnames,sep=""), pch = 1, col = 'black', sub = plot.subtitle)
- } else {
- # two genetic profiles
- gpa = geneticProfiles[1]
- gpb = geneticProfiles[2]
- plot(df[,gpa],df[,gpb] , main = '', xlab = paste(genes[1],", ",gpaxisnames[gpa],sep=""), ylab = paste(genes[1],", ",gpaxisnames[gpb],sep=""), pch = 1, col = 'black', sub = plot.subtitle)
- }
-
- } else if (skin == 'disc') {
-
- if(length(genes) == 1 & length(geneticProfiles) == 1) {
- barplot(table(df[,1]),xlab=paste(genes,", ",gpaxisnames,sep=""),main='',ylab='frequency')
- } else {
- #discrete vs discrete
- return(errormsg('discrete vs. discrete data not implemented'))
- }
-
- } else if (skin =='disc_cont') {
-
- # skin only valid for two genetic profiles
- if(length(geneticProfiles) != 2) {
- return(errormsg("two genetic profiles required for skin 'disc_cont'"))
- } else {
- # skin assumes that first genetic profile is discrete
- gp.disc = geneticProfiles[1] # b
- gp.cont = geneticProfiles[2] # a
-
- boxplot(df[,gp.cont] ~ df[,gp.disc], outpch = NA, main = '', xlab = paste(genes[1],", ",gpaxisnames[gp.disc],sep=""), ylab = paste(genes[1],", ",gpaxisnames[gp.cont],sep=""), border = 'gray', sub = plot.subtitle)
- stripchart(df[,gp.cont] ~ df[,gp.disc],vertical = TRUE, add = TRUE, method = 'jitter', pch = 1, col = 'black')
- }
-
- } else if (skin == 'cna_mrna_mut') {
-
- # skin uses parameters
- # * skin.col.gp = mut
- # * skin.normals = normal_case_set [optional]
-
- # fetch optional normal mRNA data
-
- if (skin.normals != '') {
- df.norm = getProfileData(x, genes, geneticProfiles[2], skin.normals)
- if (nrow(df.norm) == 0) { return(errormsg(paste('empty data frame returned :\n',colnames(df.norm)[1]))) }
- # check if data is missing (NaN)
- if ( !all(is.nan(df.norm[,1])) ) {
- # add normal data to dataframe
- df.norm2 = cbind(rep(-3,nrow(df.norm)),df.norm[,1])
- colnames(df.norm2) = geneticProfiles
- df = rbind(df,df.norm2)
- }
- }
-
- # create boxplot
- df.nona = df[apply(df, 1, function(x) {!any(is.na(x))}),]
- ylim=range(df.nona[,geneticProfiles[2]],na.rm=TRUE)
- labels=seq(-3,2)
- names(labels)=c("Normal","Homdel","Hetloss","Diploid","Gain","Amp")
- labels.inuse = sort(unique(df.nona[,geneticProfiles[1]])) # sort removes any NA
-
- boxplot(df.nona[,geneticProfiles[2]] ~ df.nona[,geneticProfiles[1]], main='', outline=FALSE,
- xlab = paste(genes,", ",gpaxisnames[geneticProfiles[1]],sep=""),
- ylab = paste(genes,", ",gpaxisnames[geneticProfiles[2]],sep=""),
- border="gray",ylim=ylim, axes=FALSE, outpch = NA, sub = plot.subtitle)
-
- axis(1,at=seq(1,length(labels.inuse)),labels=names(labels)[match(labels.inuse,labels)],cex.axis=0.8)
- axis(2,cex.axis=0.8,las=2)
- # box()
-
- # manually jitter data
-
- # order data by CNA status
- df.nona=df.nona[order(df.nona[,geneticProfiles[1]]),]
- xy=list()
- xy$MRNA=df.nona[,geneticProfiles[2]]
-
- cats=cbind(1:length(labels.inuse),as.data.frame(table(df.nona[,geneticProfiles[1]])))
- colnames(cats)=c("X","Class","Count")
- rownames(cats)=names(labels)[match(cats[,2],labels)]
-
- xy$JITTER = unlist( apply(cats,1, function(cc) {
- y=rep.int(as.numeric(cc[1]),as.numeric(cc[3]))
- y=y+stats::runif(length(y),-0.1,0.1)
- }))
- xy=as.data.frame(xy)
- colnames(xy) = c('MRNA','JITTER')
-
- # Initialize plotting features
- nonmut.pch = 4
- N=nrow(xy)
- cex=rep(0.9,N)
- pch=rep(nonmut.pch,N)
- col=rep("royalblue",N)
- bg=rep(NA,N)
-
- # fetch mutation data for color coding
- if (length(skin.col.gp) == 1) {
- df.mut = getProfileData(x, genes, skin.col.gp, caseList, cases, caseIdsKey)
- if (nrow(df.mut) == 0) { return(errormsg(paste('empty data frame returned :\n',colnames(df.mut)[1]))) }
- # get matrix corresponding to df.nona
- df.mut = df.mut[rownames(df.nona),1]
- # check if data is missing (NaN)
- mut=which(!is.na(df.mut))
- if(length(mut)>0) {
- # default mutation
- col[mut]="red3"
- bg[mut]="goldenrod"
- pch[mut]=21
- mt=list()
- mt$pch=c(21,23,24,25,22)
- mt$type=c("missense","nonsense","splice","shift","in_frame")
- mt$pattern=c("^[a-z][0-9]+[a-z]$","[*x]$","^(e.+[0-9]|.+_splice)$","fs$","del$")
- mt$bg=c("goldenrod","darkblue","darkgray","black","goldenrod")
- mt=as.data.frame(mt)
- for(i in 1:nrow(mt)) {
- idx=grep(mt$pattern[i],tolower(df.mut))
- pch[idx]=mt$pch[i]
- bg[idx]=as.character(mt$bg[i])
- }
- #col[mut]="red3" # ??????
- legend("topleft",bty="n",
- as.character(as.vector(mt[["type"]])),col="red3",
- pt.bg=as.character(as.vector(mt[["bg"]])),
- pch=mt[["pch"]],cex=0.85,pt.cex=1.0
- )
- }
- }
-
- ## # Plot the jittered data, add mutated points last
- xy.mut = (pch != nonmut.pch)
- points(xy$JITTER[!xy.mut],xy$MRNA[!xy.mut],pch=pch[!xy.mut],cex=cex[!xy.mut],col=col[!xy.mut],bg=bg[!xy.mut])
- points(xy$JITTER[xy.mut],xy$MRNA[xy.mut],pch=pch[xy.mut],cex=cex[xy.mut],col=col[xy.mut],bg=bg[xy.mut])
- box()
-
- } else if (skin == 'meth_mrna_cna_mut' | skin == 'cna_mut') {
-
- # these skins use parameters
- # * skin.col.gp = (cna,mut) [optional]
- # * skin.normals = normal_case_set [optional]
- # [meth_mrna_cna_mut] forces x axis range to [0,1.05]
-
- # fetch cna and mut data for color coding
- pch=rep(1,nrow(df))
- col=rep("black",nrow(df))
-
- if (length(skin.col.gp) == 2) { # color by both CNA and mutation
- df.col = getProfileData(x, genes, skin.col.gp, caseList, cases, caseIdsKey)
- if (nrow(df.col) == 0) { return(errormsg(paste('empty data frame returned :\n',colnames(df.col)[1]))) }
- # because mut is text vector, we need to transform cna to integer vector instead of factor
- cna = as.integer(as.vector(df.col[,skin.col.gp[1]]))
- mut = as.vector(df.col[,skin.col.gp[2]])
- col[cna==-2]="darkblue"
- col[cna==-1]="deepskyblue"
- col[cna==1]="hotpink"
- col[cna==2]="red3"
- col[mut!="NaN"]="orange"
- pch[mut!="NaN"]=20
- }
- else if (length(skin.col.gp) == 1) { # color only by CNA
- df.col = getProfileData(x, genes, skin.col.gp, caseList, cases, caseIdsKey)
- if (nrow(df.col) == 0) { return(errormsg(paste('empty data frame returned :\n',colnames(df.col)[1]))) }
- # because mut is text vector, we need to transform cna to integer vector instead of factor
- cna = as.integer(as.vector(df.col[,1]))
- col[cna==-2]="darkblue"
- col[cna==-1]="deepskyblue"
- col[cna==1]="hotpink"
- col[cna==2]="red3"
- }
-
- # fetch optional normal methylation and mRNA data
- if (skin.normals != '') {
- df.norm = getProfileData(x, genes, geneticProfiles, skin.normals)
- if (nrow(df.norm) == 0) { return(errormsg(paste('empty data frame returned :\n',colnames(df.norm)[1]))) }
- # remove missing data
- df.norm = df.norm[apply(df.norm, 1, function(x) {!any(is.na(x))}),]
- if ( nrow(df.norm) > 0) {
- df = rbind(df,df.norm)
- col = append(col, rep("black",nrow(df.norm)))
- pch = append(pch, rep(20,nrow(df.norm)))
- }
- }
-
- xlim = range(df[,geneticProfiles[1]],na.rm = TRUE)
- # add 15% to range, to make room for legend
- if (legend.pos == 'topright') {
- xlim = c(min(xlim),max(xlim) + (max(xlim)-min(xlim))*0.15)
- }
- else if (legend.pos == 'topleft') {
- xlim = c(min(xlim) - (max(xlim)-min(xlim))*0.15,max(xlim))
- }
-
- if (skin == 'meth_mrna_cna_mut') {
- # force x axis range to [0,1.05]
- xlim = c(0,1.05)
- }
-
- # now plot
- plot( df[,geneticProfiles[1]],df[,geneticProfiles[2]],main="",
- xlab=paste(genes,", ",gpaxisnames[geneticProfiles[1]],sep=""),
- ylab=paste(genes,", ",gpaxisnames[geneticProfiles[2]],sep=""),
- xlim=xlim,pch=pch,col=col,cex=1.2, sub = plot.subtitle
- )
-
- #abline(lm(d$rna~d$methylation),col="red3",lty=2,lwd=1.5)
- #lines(loess.smooth(d$methylation,d$rna),col="darkgray",lwd=2)
-
- # Replace with dynamically created legend when time permits
- legend(legend.pos,bty="n",
- c("Homdel","Hetloss","Diploid","Gain","Amp","Mutated","Normal"),
- col=c('darkblue','deepskyblue','black','hotpink','red3','orange','black'),
- pch=c(1,1,1,1,1,20,20),cex=0.85,pt.cex=1.0
- )
-
- } else {
- return(errormsg(paste("unknown skin:",skin)))
- }
-
- return(TRUE)
-
-})
-
-
-setMethodS3("test","CGDS", function(x, ...) {
- checkEq = function(a,b) { if (identical(a,b)) "OK\n" else "FAILED!\n" }
- checkGrt = function(a,b) { if (a > b) "OK\n" else "FAILED!\n" }
- cancerstudies = getCancerStudies(x)
- cat('getCancerStudies... ',
- checkEq(colnames(cancerstudies),c("cancer_study_id","name","description")))
- ct = cancerstudies[2,1] # should be row 1 instead ...
-
- cat('getCaseLists (1/2) ... ',
- checkEq(colnames(getCaseLists(x,ct)),
- c("case_list_id","case_list_name",
- "case_list_description","cancer_study_id","case_ids")))
- cat('getCaseLists (2/2) ... ',
- checkEq(colnames(getCaseLists(x,'xxx')),
- 'Error..Problem.when.identifying.a.cancer.study.for.the.request.'))
-
- cat('getGeneticProfiles (1/2) ... ',
- checkEq(colnames(getGeneticProfiles(x,ct)),
- c("genetic_profile_id","genetic_profile_name","genetic_profile_description",
- "cancer_study_id","genetic_alteration_type","show_profile_in_analysis_tab")))
- cat('getGeneticProfiles (2/2) ... ',
- checkEq(colnames(getGeneticProfiles(x,'xxx')),
- 'Error..Problem.when.identifying.a.cancer.study.for.the.request.'))
-
- # clinical data
- # check colnames
- cat('getClinicalData (1/1) ... ',
- checkEq(colnames(getClinicalData(x,'gbm_tcga_all'))[1],
- c("DFS_MONTHS")))
-
- # check one gene, one profile
- cat('getProfileData (1/6) ... ',
- checkEq(colnames(getProfileData(x,'NF1','gbm_tcga_mrna','gbm_tcga_all')),
- "NF1"))
- # check many genes, one profile
- cat('getProfileData (2/6) ... ',
- checkEq(colnames(getProfileData(x,c('MDM2','MDM4'),'gbm_tcga_mrna','gbm_tcga_all')),
- c("MDM2","MDM4")))
- # check one gene, many profile
- cat('getProfileData (3/6) ... ',
- checkEq(colnames(getProfileData(x,'NF1',c('gbm_tcga_mrna','gbm_tcga_mutations'),'gbm_tcga_all')),
- c('gbm_tcga_mrna','gbm_tcga_mutations')))
- # check that requesting 2 cases returns a data frame with the 2 matching row names
- cat('getProfileData (4/6) ... ',
- checkEq(rownames(getProfileData(x,'BRCA1','gbm_tcga_mrna',cases=c('TCGA-02-0001','TCGA-02-0003'))),
- make.names(c('TCGA-02-0001','TCGA-02-0003'))))
- # invalid gene names return empty data.frame
- cat('getProfileData (5/6) ... ',
- checkEq(nrow(getProfileData(x,c('NF10','NF11'),'gbm_tcga_mrna','gbm_tcga_all')),as.integer(0)))
- # invalid case_list_id returns error
- cat('getProfileData (6/6) ... ',
- checkEq(colnames(getProfileData(x,'NF1','gbm_tcga_mrna','xxx')),
- 'Error..Invalid.case_set_id...xxx.'))
-})
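The `cna_mrna_mut` skin above assigns a plotting symbol to each mutated sample by matching the lower-cased amino acid change against a fixed list of regular expressions, with later patterns overwriting earlier ones. A minimal sketch of that classification step in isolation follows. The helper name `classify_mutation` and the sample inputs are hypothetical and are not part of cgdsr; only the patterns and type names are taken from the removed code.

```r
# Hypothetical helper (not part of cgdsr): label amino acid change strings
# using the same patterns, in the same order, as the 'cna_mrna_mut' skin.
classify_mutation <- function(aa_change) {
  patterns <- c(
    missense = "^[a-z][0-9]+[a-z]$",
    nonsense = "[*x]$",
    splice   = "^(e.+[0-9]|.+_splice)$",
    shift    = "fs$",
    in_frame = "del$"
  )
  lower <- tolower(aa_change)
  out <- rep(NA_character_, length(aa_change))
  for (type in names(patterns)) {
    hit <- grepl(patterns[[type]], lower)  # NA inputs never match
    out[hit] <- type                       # later patterns overwrite earlier ones
  }
  out
}

# Illustrative inputs only (not real query results)
classify_mutation(c("R130Q", "R130*", "K267fs", "e5+1", "L112del", NA))
```

Because the patterns are applied in order, a string such as "R130X" first matches the missense pattern and is then relabeled as nonsense, which mirrors the overwrite behavior of the original loop.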
diff --git a/cgds-r/cgdsr/inst/doc/cgdsr.Rnw b/cgds-r/cgdsr/inst/doc/cgdsr.Rnw
deleted file mode 100644
index bff770b26b1..00000000000
--- a/cgds-r/cgdsr/inst/doc/cgdsr.Rnw
+++ /dev/null
@@ -1,260 +0,0 @@
-\documentclass[a4paper]{article}
-
-%\VignetteIndexEntry{Introduction to the CGDS R library}
-%\VignettePackage{cgdsr}
-
-% Definitions
-
-\usepackage{url}
-
-\title{The CGDS-R library}
-\author{Anders Jacobsen}
-
-\begin{document}
-
-\maketitle
-
-\tableofcontents
-
-\section{Introduction}
-
-This package provides a basic set of R functions for querying
-the Cancer Genomic Data Server (CGDS) hosted by the Computational
-Biology Center (cBio) at the Memorial Sloan-Kettering Cancer Center (MSKCC). This
-service is a part of the cBio Cancer Genomics Portal,
-\url{http://www.cbioportal.org/}.
-
-In summary, the library can issue the following types of queries:
-
-\begin{itemize}
-\item{
-\texttt{getCancerStudies()} : What cancer studies are hosted on the server?
-For example, TCGA glioblastoma or TCGA ovarian cancer.
-}
-\item{
-\texttt{getGeneticProfiles()} : What genetic profile types are available for
-cancer study X? For example, mRNA expression or copy number alterations.
-}
-\item{
-\texttt{getCaseLists()} : What case sets are available for cancer study X? For
-example, all samples or only samples corresponding to a given cancer subtype.
-}
-\item{
-\texttt{getProfileData()}: Retrieve slices of genomic data. For
-example, a client can retrieve all mutation data for PTEN and EGFR in
-TCGA glioblastoma.
-}
-\item{
-\texttt{getClinicalData()}: Retrieve clinical data (e.g. patient
-survival time and age) for a given cancer study and list of cases.
-}
-\end{itemize}
-
-Each of these functions will be briefly described in the following
-sections. The last part of this document includes some concrete examples
-of how to access and plot the data.
-
-The purpose of this document is to give the reader a quick overview of
-the \texttt{cgdsr} package. Please refer to the corresponding R manual
-pages for a more detailed explanation of arguments and output for each
-function.
-
-\section{The CGDS R interface}
-
-\subsection{\texttt{CGDS()} : Create a CGDS connection object}
-Initially, we will establish a connection to the public CGDS
-server hosted by Memorial Sloan-Kettering Cancer Center. The function
-for creating a CGDS connection object requires the URL of the CGDS
-server service, in this case \url{http://www.cbioportal.org/public-portal/}, as an argument.
-
-<<>>=
-library(cgdsr)
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-@
-
-The variable \texttt{mycgds} is now a CGDS connection object
-pointing at the URL for the public CGDS server. This connection object must
-be included as an argument to all subsequent interface
-calls. Optionally, we can now perform a set of simple tests of the data
-returned from the CGDS connection object using the \texttt{test} function:
-
-<<>>=
-# Test the CGDS endpoint URL using a few simple API tests
-test(mycgds)
-@
-
-\subsection{\texttt{getCancerStudies()} : Retrieve a set of available cancer studies}
-
-Having created a CGDS connection object, we can now retrieve a data
-frame with available cancer studies using the \texttt{getCancerStudies} function:
-
-<<>>=
-# Get list of cancer studies at server
-getCancerStudies(mycgds)[,c(1,2)]
-@
-
-Here we are only showing the first two columns, the cancer study ID and
-short name, of the result data frame. There is also a third column,
-a longer description of the cancer study. The cancer study ID must be
-used in subsequent interface calls to retrieve case lists and genetic
-data profiles (see below).
-
-\subsection{\texttt{getGeneticProfiles()} : Retrieve genetic data profiles for a specific cancer study}
-This function queries the CGDS API and returns the available genetic
-profiles, e.g. mutation or copy number profiles, stored for a
-specific cancer study. Below we list the current genetic profiles for
-the TCGA glioblastoma cancer study:
-
-<<>>=
-getGeneticProfiles(mycgds,'gbm_tcga')[,c(1:2)]
-@
-
-Here we are only listing the first two columns, genetic profile ID and
-short name, of the resulting data frame. Please refer to the R manual
-pages for a more extended specification of the arguments and output.
-
-
-\subsection{\texttt{getCaseLists()} : Retrieve case lists for a specific cancer study}
-This function queries the CGDS API and returns available case lists
-for a specific cancer study. For example, within a particular study, only
-some cases may have sequence data, and another subset of cases may
-have been sequenced and treated with a specific therapeutic protocol. Multiple
-case lists may be associated with each cancer study, and this method
-enables you to retrieve meta-data regarding all of these case
-lists. Below we list the current case lists for the TCGA glioblastoma
-cancer study:
-
-<<>>=
-getCaseLists(mycgds,'gbm_tcga')[,c(1:2)]
-@
-
-Here we are only listing the first two columns, case list ID and
-short name, of the resulting data frame. Please refer to the R manual
-pages for a more extended specification of the arguments and output.
-
-\subsection{\texttt{getProfileData()} : Retrieve genomic profile data for genes and genetic profiles}
-The function queries the CGDS API and returns data based on gene(s),
-genetic profile(s), and a case list. The function accepts either
-a list of genes with a single genetic profile or, conversely,
-a single gene with a list of genetic profiles. Importantly, the format of the output
-data frame depends on whether a single gene or a list of genes was specified in
-the arguments. Below we are retrieving mRNA expression and copy number
-alteration genetic profiles for the NF1 gene in all samples of the TCGA glioblastoma
-cancer study:
-
-<<>>=
-getProfileData(mycgds, "NF1", c("gbm_tcga_cna_rae","gbm_tcga_mrna"), "gbm_tcga_all")[c(1:5),]
-@
-
-We are here only showing the first five rows of the data frame. In the next example, we are
-retrieving mRNA expression data for the MDM2 and MDM4 genes:
-
-<<>>=
-getProfileData(mycgds, c("MDM2","MDM4"), "gbm_tcga_mrna", "gbm_tcga_all")[c(1:5),]
-@
-
-We are again only showing the first five rows of the data frame.
-
-\subsection{\texttt{getClinicalData()} : Retrieve clinical data for a list of cases}
-The function queries the CGDS API and returns available clinical data (e.g. patient
-survival time and age) for a given case list. Results are returned in
-a data frame with a row for each case and a column for each clinical
-attribute. The available clinical attributes are:
-
-\begin{itemize}
-\item{
-\texttt{overall\_survival\_months}: Overall survival, in months.
-}
-\item{
-\texttt{overall\_survival\_status}: Overall survival status, usually
-indicated as "LIVING" or "DECEASED".
-}
-\item{
-\texttt{disease\_free\_survival\_months}: Disease free survival, in months.
-}
-\item{
-\texttt{disease\_free\_survival\_status}: Disease free survival status, usually indicated as "DiseaseFree" or "Recurred/Progressed".
-}
-\item{
-\texttt{age\_at\_diagnosis}: Age at diagnosis.
-}
-\end{itemize}
-
-Below we retrieve clinical data for the TCGA ovarian cancer dataset (only first five
-cases/rows are shown):
-
-<<>>=
-getClinicalData(mycgds, "ova_all")[c(1:5),]
-@
-
-\section{Examples}
-
-\subsection{Example 1: Association of NF1 copy number alteration and mRNA expression in glioblastoma}
-As a simple example, we will generate a plot of the association between
-copy number alteration (CNA) status and mRNA expression change for the
-NF1 tumor suppressor gene in glioblastoma. This plot is very similar
-to Figure 2b in the TCGA research network paper on glioblastoma
-(McLendon et al. 2008). The mRNA expression of NF1 has been
-median adjusted on the gene level (by globally subtracting the median expression
-level of NF1 across all samples).
-
-\begin{center}
-<>=
-df = getProfileData(mycgds, "NF1", c("gbm_tcga_cna_rae","gbm_tcga_mrna"), "gbm_tcga_all")
-head(df)
-boxplot(df[,2] ~ df[,1], main="NF1 : CNA status vs mRNA expression", xlab="CNA status", ylab="mRNA expression", outpch = NA)
-stripchart(df[,2] ~ df[,1], vertical=T, add=T, method="jitter",pch=1,col='red')
-@
-\end{center}
-
-Alternatively, the generic \texttt{cgdsr} \texttt{plot()}
-function can be used to generate a similar plot:
-
-\begin{center}
-<>=
-plot(mycgds, "gbm_tcga", "NF1", c("gbm_tcga_cna_rae","gbm_tcga_mrna"), "gbm_tcga_all", skin = 'disc_cont')
-@
-\end{center}
-
-\subsection{Example 2: MDM2 and MDM4 mRNA expression levels in glioblastoma}
-In this example, we evaluate the relationship of MDM2 and MDM4
-expression levels in glioblastoma. mRNA expression levels of MDM2 and MDM4 have been
-median adjusted on the gene level (by globally subtracting the median expression
-level of the individual gene across all samples).
-
-\begin{center}
-<>=
-df = getProfileData(mycgds, c("MDM2","MDM4"), "gbm_tcga_mrna", "gbm_tcga_all")
-head(df)
-plot(df, main="MDM2 and MDM4 mRNA expression", xlab="MDM2 mRNA expression", ylab="MDM4 mRNA expression")
-@
-\end{center}
-
-Alternatively, the generic \texttt{cgdsr} \texttt{plot()}
-function can be used to generate a similar plot:
-
-\begin{center}
-<>=
-plot(mycgds, "gbm_tcga", c("MDM2","MDM4"), "gbm_tcga_mrna" ,"gbm_tcga_all")
-@
-\end{center}
-
-
-\subsection{Example 3: Comparing expression of PTEN in primary and metastatic
- prostate cancer tumors}
-In this example we plot the mRNA expression levels of PTEN in primary
-and metastatic prostate cancer tumors.
-
-\begin{center}
-<>=
-df.pri = getProfileData(mycgds, "PTEN", "prad_mskcc_mrna", "prad_mskcc_primary")
-head(df.pri)
-df.met = getProfileData(mycgds, "PTEN", "prad_mskcc_mrna", "prad_mskcc_mets")
-head(df.met)
-boxplot(list(t(df.pri),t(df.met)), main="PTEN expression in primary and metastatic tumors", xlab="Tumor type", ylab="PTEN mRNA expression",names=c('primary','metastatic'), outpch = NA)
-stripchart(list(t(df.pri),t(df.met)), vertical=T, add=T, method="jitter",pch=1,col='red')
-@
-\end{center}
-
-\end{document}
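The vignette examples state that mRNA expression values were median adjusted on the gene level, that is, the median across all samples was subtracted from each gene. A minimal sketch of that adjustment on a small made-up matrix follows. The values and the variable names `expr` and `adj` are illustrative assumptions, and this is not necessarily how the server computed the published profiles.

```r
# Illustrative expression matrix: rows are samples, columns are genes.
# The numbers are made up for demonstration purposes.
expr <- matrix(
  c(1.0, 2.0, 4.0,
    0.5, 0.5, 2.0),
  ncol = 2,
  dimnames = list(c("s1", "s2", "s3"), c("MDM2", "MDM4"))
)

# Per-gene median across all samples
gene_medians <- apply(expr, 2, median, na.rm = TRUE)

# Subtract each gene's median from its column (sweep defaults to "-")
adj <- sweep(expr, 2, gene_medians)
adj
```

After the adjustment, each gene column has a median of zero, so values can be read as deviations from the typical expression level of that gene.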
diff --git a/cgds-r/cgdsr/inst/doc/cgdsr.pdf b/cgds-r/cgdsr/inst/doc/cgdsr.pdf
deleted file mode 100644
index c3abfbbb417..00000000000
Binary files a/cgds-r/cgdsr/inst/doc/cgdsr.pdf and /dev/null differ
diff --git a/cgds-r/cgdsr/man/cgdsr-CGDS.Rd b/cgds-r/cgdsr/man/cgdsr-CGDS.Rd
deleted file mode 100644
index d0ee23567b7..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-CGDS.Rd
+++ /dev/null
@@ -1,32 +0,0 @@
-\name{cgdsr-CGDS}
-\alias{cgdsr-CGDS}
-\alias{CGDS}
-\title{Construct a CGDS connection object}
-\description{Creates a CGDS connection object from a CGDS endpoint URL. This object
- must be passed on to the methods which query the server.}
-\usage{CGDS(url,verbose=FALSE,ploterrormsg='')}
-\arguments{
- \item{url}{A CGDS URL (required).}
- \item{verbose}{A boolean variable specifying verbose output (default FALSE)}
- \item{ploterrormsg}{An optional message to display in plots if an error occurs (default '')}
-}
-\value{ A CGDS connection object. This object must be passed on to the methods which query the server. }
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{getCancerStudies}},\code{\link{getGeneticProfiles}},\code{\link{getCaseLists}},\code{\link{getProfileData}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-# Test the CGDS endpoint URL using a few simple API tests
-test(mycgds)
-
-# Get list of cancer studies at server
-getCancerStudies(mycgds)
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-getCancerStudies.Rd b/cgds-r/cgdsr/man/cgdsr-getCancerStudies.Rd
deleted file mode 100644
index b4159f2e1a4..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-getCancerStudies.Rd
+++ /dev/null
@@ -1,41 +0,0 @@
-\name{cgdsr-getCancerStudies}
-\alias{cgdsr-getCancerStudies}
-\alias{getCancerStudies}
-\alias{getCancerStudies.CGDS}
-\title{Get cancer studies available in CGDS}
-\description{Queries the CGDS API and returns available cancer
- studies. Input is a CGDS object and output is a data.frame with
- information regarding the different cancer studies.}
-\usage{\method{getCancerStudies}{CGDS}(x, ...)}
-\arguments{
- \item{x}{A CGDS object (required)}
- \item{...}{Not used.}
-}
-\value{A data.frame with three columns:
-\enumerate{
-\item \var{cancer_study_id}: unique ID used to identify the cancer study in
-subsequent interface calls. This is a human readable ID.
-\item \var{name}: short name of the cancer type.
-\item \var{description}: short description of the cancer type, describing the
-source of study.
-}}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}},\code{\link{getGeneticProfiles}},\code{\link{getCaseLists}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-getCancerStudies(mycgds)
-
-# Get available case lists (collection of samples) for a given cancer study
-mycancerstudy = getCancerStudies(mycgds)[2,1]
-mycaselist = getCaseLists(mycgds,mycancerstudy)[1,1]
-
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-getCancerTypes.Rd b/cgds-r/cgdsr/man/cgdsr-getCancerTypes.Rd
deleted file mode 100644
index 3bcdf5aced8..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-getCancerTypes.Rd
+++ /dev/null
@@ -1,42 +0,0 @@
-\name{cgdsr-getCancerTypes}
-\alias{cgdsr-getCancerTypes}
-\alias{getCancerTypes}
-\alias{getCancerTypes.CGDS}
-\title{Get cancer types available in CGDS}
-\description{Queries the CGDS API and returns available cancer
- types. Input is a CGDS object and output is a data.frame with
- information regarding the different cancer types.}
-\usage{\method{getCancerTypes}{CGDS}(x, ...)}
-\arguments{
- \item{x}{A CGDS object (required)}
- \item{...}{Not used.}
-}
-\value{A data.frame with three columns:
-\enumerate{
-\item \var{cancer_type_id}: unique ID used to identify the cancer type in
-subsequent interface calls. This is a human readable ID. For example,
-"gbm" identifies the TCGA GBM data set.
-\item \var{name}: short name of the cancer type.
-\item \var{description}: short description of the cancer type, describing the
-source of study.
-}}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/cgx/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}},\code{\link{getGeneticProfiles}},\code{\link{getCaseLists}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://cbio.mskcc.org/cgds-public/")
-
-getCancerTypes(mycgds)
-
-# Get available case lists (collection of samples) for a given cancer type
-mycancertype = getCancerTypes(mycgds)[1,1]
-mycaselist = getCaseLists(mycgds,mycancertype)[1,1]
-
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-getCaseLists.Rd b/cgds-r/cgdsr/man/cgdsr-getCaseLists.Rd
deleted file mode 100644
index 10beab2dec5..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-getCaseLists.Rd
+++ /dev/null
@@ -1,60 +0,0 @@
-\name{cgdsr-getCaseLists}
-\alias{cgdsr-getCaseLists}
-\alias{getCaseLists}
-\alias{getCaseLists.CGDS}
-\title{Get available case lists for a specific cancer study
-}
-\description{Queries the CGDS API and returns available case lists for a
- specific cancer study.
-}
-\usage{\method{getCaseLists}{CGDS}(x,cancerStudy,...)}
-\arguments{
- \item{x}{A CGDS object (required)}
- \item{cancerStudy}{cancer study ID (required)}
- \item{...}{Not used.}
-}
-\value{A data.frame with five columns:
-\enumerate{
-\item \var{case_list_id}: a unique ID used to identify the case list in
-subsequent interface calls. This is a human readable ID. For example,
-"gbm_tcga_all" identifies all cases profiled in the TCGA GBM study.
-\item \var{case_list_name}: short name for the case list.
-\item \var{case_list_description}: short description of the case list.
-\item \var{cancer_study_id}: cancer study ID tied to this case list. Will
-match the input cancer_study_id.
-\item \var{case_ids}: space delimited list of all case IDs that make up this case list.
-}
-}
-\details{Queries the CGDS API and returns available case lists for a
- specific cancer study. For example, within a particular study, only
- some cases may have sequence data, and another subset of cases may
- have been sequenced and treated with a specific therapeutic protocol. Multiple
- case lists may be associated with each cancer study, and this method
- enables you to retrieve meta-data regarding all of these case lists.
-}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}},\code{\link{getCancerStudies}},\code{\link{getGeneticProfiles}},\code{\link{getProfileData}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-# Get list of cancer studies at server
-getCancerStudies(mycgds)
-
-# Get available case lists (collection of samples) for a given cancer study
-mycancerstudy = getCancerStudies(mycgds)[2,1]
-mycaselist = getCaseLists(mycgds,mycancerstudy)[1,1]
-
-# Get available genetic profiles
-mygeneticprofile = getGeneticProfiles(mycgds,mycancerstudy)[4,1]
-
-# Get data slices for a specified list of genes, genetic profile and case list
-getProfileData(mycgds,c('BRCA1','BRCA2'),mygeneticprofile,mycaselist)
-}
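The `test` method in cgds.R expects that a request with an unknown cancer study or case set returns a data frame whose single column name begins with "Error". A caller could check for that condition before using a result. The sketch below assumes that convention holds for every failed request. The helper name `is_cgds_error` is hypothetical and is not part of cgdsr, and the data frames are built locally rather than fetched from a server.

```r
# Hypothetical helper (not part of cgdsr): TRUE when the first column name
# looks like an error message returned by the server.
is_cgds_error <- function(df) {
  is.data.frame(df) && ncol(df) >= 1 && grepl("^Error", colnames(df)[1])
}

# Locally constructed stand-ins for server responses
bad <- data.frame(x = character(0))
colnames(bad) <- "Error..Problem.when.identifying.a.cancer.study.for.the.request."
ok <- data.frame(case_list_id = "gbm_tcga_all", stringsAsFactors = FALSE)

is_cgds_error(bad)
is_cgds_error(ok)
```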
diff --git a/cgds-r/cgdsr/man/cgdsr-getClinicalData.Rd b/cgds-r/cgdsr/man/cgdsr-getClinicalData.Rd
deleted file mode 100644
index 7e4b854e363..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-getClinicalData.Rd
+++ /dev/null
@@ -1,49 +0,0 @@
-\name{cgdsr-getClinicalData}
-\alias{cgdsr-getClinicalData}
-\alias{getClinicalData}
-\alias{getClinicalData.CGDS}
-\title{Get clinical data for cancer study}
-\description{Queries the CGDS API and returns clinical data for
- a given case list.}
-\usage{\method{getClinicalData}{CGDS}(x, caseList, cases, caseIdsKey, ...)}
-\arguments{
- \item{x}{A CGDS object (required)}
- \item{caseList}{A case list ID}
- \item{cases}{A vector of case IDs}
- \item{caseIdsKey}{Only used by web portal.}
- \item{...}{Not used.}
-}
-\value{A data.frame with rows for each case, rownames corresponding to
- case IDs, and columns:
-\enumerate{
-\item \var{overall_survival_months}: Overall survival, in months.
-\item \var{overall_survival_status}: Overall survival status, usually
-indicated as "LIVING" or "DECEASED".
-\item \var{disease_free_survival_months}: Disease free survival, in months.
-\item \var{disease_free_survival_status}: Disease free survival status,
-usually indicated as "DiseaseFree" or "Recurred/Progressed".
-\item \var{age_at_diagnosis}: Age at diagnosis.
-}}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}},\code{\link{getCaseLists}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-getCancerStudies(mycgds)
-
-# Get available case lists (collection of samples) for a given cancer study
-mycancerstudy = getCancerStudies(mycgds)[2,1]
-mycaselist = getCaseLists(mycgds,mycancerstudy)[1,1]
-
-# Get clinical data for caselist
-getClinicalData(mycgds,mycaselist)
-
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-getGeneticProfiles.Rd b/cgds-r/cgdsr/man/cgdsr-getGeneticProfiles.Rd
deleted file mode 100644
index 03195f17c25..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-getGeneticProfiles.Rd
+++ /dev/null
@@ -1,56 +0,0 @@
-\name{cgdsr-getGeneticProfiles}
-\alias{cgdsr-getGeneticProfiles}
-\alias{getGeneticProfiles}
-\alias{getGeneticProfiles.CGDS}
-\title{Get available genetic data profiles for a specific cancer study}
-\description{Queries the CGDS API and returns the available genetic
- profiles, e.g. mutation or copy number profiles, stored for a
- specific cancer study.
-}
-\usage{\method{getGeneticProfiles}{CGDS}(x,cancerStudy,...)}
-\arguments{
- \item{x}{A CGDS object (required)}
- \item{cancerStudy}{cancer study ID (required)}
- \item{...}{Not used.}
-}
-\value{A data.frame with six columns:
-\enumerate{
-\item \var{genetic_profile_id}: a unique ID used to identify the genetic profile
-in subsequent interface calls. This is a human readable ID. For
-example, "gbm_tcga_mutations" identifies the TCGA GBM mutation genetic profile.
-\item \var{genetic_profile_name}: short profile name.
-\item \var{genetic_profile_description}: short profile description.
-\item \var{cancer_study_id}: cancer study ID tied to this genetic profile. Will
-match the input cancer_study_id.
-\item \var{genetic_alteration_type}: indicates the profile type. Will be one of:
-MUTATION, MUTATION_EXTENDED, COPY_NUMBER_ALTERATION, MRNA_EXPRESSION.
-\item \var{show_profile_in_analysis_tab}: a boolean flag used for internal purposes
-(you can safely ignore it).
-}
-}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}},\code{\link{getCancerStudies}},\code{\link{getCaseLists}},\code{\link{getProfileData}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-# Get list of cancer studies at server
-getCancerStudies(mycgds)
-
-# Get available case lists (collection of samples) for a given cancer study
-mycancerstudy = getCancerStudies(mycgds)[2,1]
-mycaselist = getCaseLists(mycgds,mycancerstudy)[1,1]
-
-# Get available genetic profiles
-mygeneticprofile = getGeneticProfiles(mycgds,mycancerstudy)[4,1]
-
-# Get data slices for a specified list of genes, genetic profile and case list
-getProfileData(mycgds,c('BRCA1','BRCA2'),mygeneticprofile,mycaselist)
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-getMutationData.Rd b/cgds-r/cgdsr/man/cgdsr-getMutationData.Rd
deleted file mode 100644
index 093299cd178..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-getMutationData.Rd
+++ /dev/null
@@ -1,60 +0,0 @@
-\name{cgdsr-getMutationData}
-\alias{cgdsr-getMutationData}
-\alias{getMutationData}
-\alias{getMutationData.CGDS}
-\title{Get mutation data for cancer study}
-\description{Queries the CGDS API and returns mutation data for
- a given case set and list of genes.}
-\usage{\method{getMutationData}{CGDS}(x, caseList, geneticProfile, genes, ...)}
-\arguments{
- \item{x}{A CGDS object (required)}
- \item{caseList}{A case list ID}
- \item{geneticProfile}{A genetic profile ID with mutation data}
- \item{genes}{A vector of query genes}
- \item{...}{Not used.}
-}
-\value{A data.frame with rows for each sample/case, rownames corresponding to
- case IDs, and columns corresponding to:
-\enumerate{
-\item \var{entrez_gene_id}: Entrez gene ID
-\item \var{gene_symbol}: HUGO gene symbol
-\item \var{sequencing_center}: Sequencing center responsible for identifying this mutation.
-\item \var{mutation_status}: somatic or germline mutation status. All mutations returned will be of type somatic.
-\item \var{age_at_diagnosis}: Age at diagnosis.
-\item \var{mutation_type}: mutation type, such as nonsense, missense, or
-frameshift_ins.
-\item \var{validation_status}: validation status. Usually valid,
-invalid, or unknown.
-\item \var{amino_acid_change}: amino acid change resulting from the mutation.
-\item \var{functional_impact_score}: predicted functional impact score,
-as predicted by Mutation Assessor.
-\item \var{xvar_link}: Link to the Mutation Assessor web site.
-\item \var{xvar_link_pdb}: Link to the Protein Data Bank (PDB) View
-within Mutation Assessor web site.
-\item \var{xvar_link_msa}: Link to the Multiple Sequence Alignment (MSA)
-view within the Mutation Assessor web site.
-\item \var{chr}: chromosome where mutation occurs.
-\item \var{start_position}: start position of mutation.
-\item \var{end_position}: end position of mutation
-}}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-getCancerStudies(mycgds)
-
-# Get available case lists (collection of samples) for a given cancer study
-# Get Extended Mutation Data for EGFR and PTEN in TCGA GBM
-#
-# getMutationData(mycgds,gbm_tcga_all,gbm_tcga_mutations,c('EGFR','PTEN'))
-
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-getProfileData.Rd b/cgds-r/cgdsr/man/cgdsr-getProfileData.Rd
deleted file mode 100644
index 3fa2af9208e..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-getProfileData.Rd
+++ /dev/null
@@ -1,70 +0,0 @@
-\name{cgdsr-getProfileData}
-\alias{cgdsr-getProfileData}
-\alias{getProfileData}
-\alias{getProfileData.CGDS}
-\title{Retrieves genomic profile data for genes and genetic profiles.}
-\description{Queries the CGDS API and returns data based on gene(s),
- genetic profile(s), and a case list.}
-\usage{\method{getProfileData}{CGDS}(x,genes,geneticProfiles,caseList,cases,caseIdsKey,...)}
-\arguments{
- \item{x}{A CGDS object (required)}
- \item{genes}{A vector of gene names or a String specifying a single gene (required)}
- \item{geneticProfiles}{ A vector of genetic profile IDs or String specifying
- a single genetic profile (required)}
- \item{caseList}{A case list ID}
- \item{cases}{A vector of case IDs}
- \item{caseIdsKey}{Only used by web portal.}
- \item{...}{Not used.}
-}
-\value{
-
-When requesting one or multiple genes and a single genetic profile,
-the function returns a data.frame with genetic profile data in columns for each gene.
-
-When requesting a single gene and multiple genetic profiles,
-the function returns a data.frame containing columns with data for each genetic profile.
-
-Cases can be specified either through a case list ID, or a vector of
-case IDs.
-}
-\details{ Only one of the two arguments may be a list: specify either a list of genes or
- a list of genetic profiles, not both. The format of the output data.frame depends on
- whether a single gene or a list of genes was specified in the arguments.
-}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}},\code{\link{getCancerStudies}},\code{\link{getGeneticProfiles}},\code{\link{getCaseLists}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-# Get list of cancer studies at server
-getCancerStudies(mycgds)
-
-# Get available case lists (collection of samples) for a given cancer study
-mycancerstudy = getCancerStudies(mycgds)[2,1]
-mycaselist = getCaseLists(mycgds,mycancerstudy)[1,1]
-
-# Get available genetic profiles
-mygeneticprofile = getGeneticProfiles(mycgds,mycancerstudy)[4,1]
-
-# Get data slices for a specified list of genes, genetic profile and case list
-getProfileData(mycgds,c('BRCA1','BRCA2'),mygeneticprofile,mycaselist)
-
-# Get data slice for a single gene
-getProfileData(mycgds,'HMGA2',mygeneticprofile,mycaselist)
-
-# Get data slice for multiple genetic profiles and single gene
-getProfileData(mycgds,'HMGA2',getGeneticProfiles(mycgds,mycancerstudy)[c(3,4),1],mycaselist)
-
-# Get the same dataset from a vector of case IDs
-cases = unlist(strsplit(getCaseLists(mycgds,mycancerstudy)[1,'case_ids'],' '))
-getProfileData(mycgds,'HMGA2',getGeneticProfiles(mycgds,mycancerstudy)[c(3,4),1],cases=cases)
-
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-package.Rd b/cgds-r/cgdsr/man/cgdsr-package.Rd
deleted file mode 100644
index e55f1444a89..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-package.Rd
+++ /dev/null
@@ -1,82 +0,0 @@
-\name{cgdsr-package}
-\alias{cgdsr-package}
-\alias{cgdsr}
-\docType{package}
-\title{
-CGDS-R : a library for accessing data in the MSKCC Cancer Genomics Data Server
-(CGDS).
-}
-\description{
-The package provides a basic set of R functions for querying the Cancer
-Genomics Data Server (CGDS), hosted by the Computational Biology Center
-at Memorial-Sloan-Kettering Cancer Center (MSKCC). Read more about this
-service at the cBio Cancer Genomics Portal,
-\url{http://www.cbioportal.org/public-portal/}.
-}
-\details{
-\tabular{ll}{
-Package: \tab cgdsr\cr
-Type: \tab Package\cr
-License: \tab GPL\cr
-LazyLoad: \tab yes\cr
-}
-The Cancer Genomics Data Server (CGDS) web service interface provides
-direct programmatic access to all genomic data stored within the server.
-This package provides a basic set of R functions for querying the CGDS
-hosted by the Computational Biology Center at Memorial-Sloan-Kettering
-Cancer Center (MSKCC).
-
-The library can issue the following types of queries:
-
-\enumerate{
-\item \var{getCancerStudies()}: What cancer studies are hosted on the server?
-For example TCGA Glioblastoma or TCGA Ovarian cancer.
-\item \var{getGeneticProfiles()}: What genetic profile types are available for
-cancer study X? For example mRNA expression or copy number alterations.
-\item \var{getCaseLists()}: What case sets are available for cancer study X? For
-example all samples or only samples corresponding to a given cancer
-subtype.
-\item \var{getProfileData()}: Retrieve slices of genomic data. For
-example, a client can retrieve all mutation data from PTEN and EGFR in
-TCGA glioblastoma.
-\item \var{getClinicalData()}: Retrieve clinical data (e.g. patient
-survival time and age) for a given case list.
-}
-}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/}
-}
-\keyword{ package }
-
-\seealso{
-\code{\link{CGDS}}, \code{\link{getCancerStudies}},
-\code{\link{getGeneticProfiles}}, \code{\link{getCaseLists}},
-\code{\link{getProfileData}}, \code{\link{getClinicalData}}.
-}
-
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-# Test the CGDS endpoint URL using a few simple API tests
-test(mycgds)
-
-# Get list of cancer studies at server
-getCancerStudies(mycgds)
-
-# Get available case lists (collection of samples) for a given cancer study
-mycancerstudy = getCancerStudies(mycgds)[2,1]
-mycaselist = getCaseLists(mycgds,mycancerstudy)[1,1]
-
-# Get available genetic profiles
-mygeneticprofile = getGeneticProfiles(mycgds,mycancerstudy)[4,1]
-
-# Get data slices for a specified list of genes, genetic profile and case list
-getProfileData(mycgds,c('BRCA1','BRCA2'),mygeneticprofile,mycaselist)
-
-# Get clinical data for the case list
-myclinicaldata = getClinicalData(mycgds,mycaselist)
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-plot.Rd b/cgds-r/cgdsr/man/cgdsr-plot.Rd
deleted file mode 100644
index 0a56ac24b56..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-plot.Rd
+++ /dev/null
@@ -1,97 +0,0 @@
-\name{cgdsr-plot}
-\alias{cgdsr-plot}
-\alias{plot}
-\alias{plot.CGDS}
-\title{Generic plot function for CGDS API data.}
-\description{Queries the CGDS API and plots data for specified genes and genetic profiles.}
-\usage{\method{plot}{CGDS}(x,cancerStudy, genes, geneticProfiles,
-caseList, cases, caseIdsKey, skin, skin.normals, skin.col.gp, add.corr, legend.pos, ...)}
-\arguments{
- \item{x}{A CGDS object (required)}
- \item{cancerStudy}{cancer study ID (required)}
- \item{genes}{A vector of gene names or a String specifying a single gene (required)}
- \item{geneticProfiles}{ A vector of genetic profile IDs or String specifying
- a single genetic profile (required)}
- \item{caseList}{A case list ID}
- \item{cases}{A vector of case IDs}
- \item{caseIdsKey}{Only used by web portal.}
- \item{skin}{A string specifying which plotting layout skin to use
- (default is continuous data 'cont')}
- \item{skin.normals}{Specify a case list ID with normal samples, only
- some skins handle normal data.}
- \item{skin.col.gp}{Specify a vector of additional case list IDs to use
- for color coding of data points. Color coding is only handled by
- some skins.}
- \item{add.corr}{Computes correlation between the two data
- vectors. Specify correlation method ('pearson' or 'spearman') as
- argument.}
- \item{legend.pos}{Position of legend in plot (default is 'topright').}
- \item{...}{Not used.}
-}
-\details{Queries the CGDS API and plots data for specified genes and
- genetic profiles.
-
- The following combinations are allowed:
- \enumerate{
- \item 1 gene and 1 genetic profile. Plots genetic profile data histogram for specified gene.
- \item 2 genes and 1 genetic profile. Scatter plot of continuous genetic profile data for the two genes.
- \item 1 gene and 2 genetic profiles. Scatter plot or boxplot
- relating two genetic profile datasets for a single gene.
- }
-
- The function currently implements the following skins:
- \enumerate{
- \item \var{cont}: This is the default skin. It treats all data as
- being continuous.
- \item \var{disc}: Requires a single gene and a single genetic
- profile. The genetic profile data is handled as a discrete dataset and
- a barplot is returned.
- \item \var{disc_cont}: Requires two genetic profiles. The first dataset is
- handled as being discrete data, and the function generates a boxplot
- with distributions for each level of the discrete genetic profile.
- \item \var{cna_mrna_mut}: This skin plots mRNA expression level as
- function of copy number status for a given gene. Data points are
- colored by mutation status if specified (\var{skin.col.gp}), and
- normal data points are included if specified (\var{skin.normals}).
- \item \var{meth_mrna_cna_mut}: This skin plots mRNA expression level as
- a function of DNA methylation status for a given gene. Data points are
- colored by copy number and mutation status if specified (two element
- vector of copy number and mutation genetic profiles specified for
- \var{skin.col.gp}). Normal data points are included if specified
- (\var{skin.normals}).
- }
-
-}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}},\code{\link{getCancerStudies}},\code{\link{getGeneticProfiles}},\code{\link{getProfileData}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-# Get list of cancer studies at server
-getCancerStudies(mycgds)
-
-# Get available case lists (collection of samples) for a given cancer study
-mycancerstudy = getCancerStudies(mycgds)[2,1]
-mycaselist = getCaseLists(mycgds,mycancerstudy)[1,1]
-
-# Get available genetic profiles
-mygeneticprofile = getGeneticProfiles(mycgds,mycancerstudy)[4,1]
-
-# histogram of genetic profile data for gene
-plot(mycgds,mycancerstudy,'MDM2',mygeneticprofile,mycaselist)
-
-# scatter plot of genetic profile data for two genes
-plot(mycgds,mycancerstudy,c('MDM2','MDM4'),mygeneticprofile,mycaselist)
-
-# See vignette for more details ...
-
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-processURL.Rd b/cgds-r/cgdsr/man/cgdsr-processURL.Rd
deleted file mode 100644
index 33722864c73..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-processURL.Rd
+++ /dev/null
@@ -1,15 +0,0 @@
-\name{cgdsr-processURL}
-\alias{cgdsr-processURL}
-\alias{processURL}
-\alias{processURL.CGDS}
-\title{Internal methods for CGDS library.}
-\description{These methods should not be invoked by the user.}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}}
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-setPlotErrorMsg.Rd b/cgds-r/cgdsr/man/cgdsr-setPlotErrorMsg.Rd
deleted file mode 100644
index d347b05cd54..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-setPlotErrorMsg.Rd
+++ /dev/null
@@ -1,33 +0,0 @@
-\name{cgdsr-setPlotErrorMsg}
-\alias{cgdsr-setPlotErrorMsg}
-\alias{setPlotErrorMsg}
-\alias{setPlotErrorMsg.CGDS}
-\title{Set custom plot error message}
-\description{Sets custom plot error message.}
-\usage{\method{setPlotErrorMsg}{CGDS}(x, msg, ...)}
-\arguments{
- \item{x}{A CGDS object (required)}
- \item{msg}{A custom message (string)}
- \item{...}{Not used.}
-}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/public-portal/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-getCancerStudies(mycgds)
-
-# Set custom error plot message
-setPlotErrorMsg(mycgds, 'My message ...')
-
-getCancerStudies(mycgds)
-
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-setVerbose.Rd b/cgds-r/cgdsr/man/cgdsr-setVerbose.Rd
deleted file mode 100644
index 94a75c99792..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-setVerbose.Rd
+++ /dev/null
@@ -1,33 +0,0 @@
-\name{cgdsr-setVerbose}
-\alias{cgdsr-setVerbose}
-\alias{setVerbose}
-\alias{setVerbose.CGDS}
-\title{Set verbose logging level for CGDS function calls}
-\description{Sets verbose logging level for CGDS function calls.}
-\usage{\method{setVerbose}{CGDS}(x, verbose, ...)}
-\arguments{
- \item{x}{A CGDS object (required)}
- \item{verbose}{Activate verbose logging (boolean)}
- \item{...}{Not used.}
-}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/public-portal/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-getCancerStudies(mycgds)
-
-# Activate verbose logging
-setVerbose(mycgds, TRUE)
-
-getCancerStudies(mycgds)
-
-}
diff --git a/cgds-r/cgdsr/man/cgdsr-test.Rd b/cgds-r/cgdsr/man/cgdsr-test.Rd
deleted file mode 100644
index 8266743a854..00000000000
--- a/cgds-r/cgdsr/man/cgdsr-test.Rd
+++ /dev/null
@@ -1,31 +0,0 @@
-\name{cgdsr-test}
-\alias{cgdsr-test}
-\alias{test}
-\alias{test.CGDS}
-\title{Simple test suite for CGDS object.}
-\description{Queries the CGDS API and returns results of the tests.}
-\usage{\method{test}{CGDS}(x, ...)}
-\arguments{
- \item{x}{A CGDS object.}
- \item{...}{Not used.}
-}
-\value{ Test results in text format.}
-\details{ A set of simple tests is evaluated. The format of the
- output returned by the following queries is tested: "getCancerStudies()",
- "getCaseLists()", and "getGeneticProfiles()"}
-\author{
-
-}
-\references{
- cBio Cancer Genomics Portal: \url{http://www.cbioportal.org/public-portal/}
-}
-\seealso{
-\code{\link{cgdsr}},\code{\link{CGDS}}
-}
-\examples{
-# Create CGDS object
-mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-# Run tests
-test(mycgds)
-}
diff --git a/cgds-r/misc/examples.R b/cgds-r/misc/examples.R
deleted file mode 100644
index 48bf4072943..00000000000
--- a/cgds-r/misc/examples.R
+++ /dev/null
@@ -1,78 +0,0 @@
-# package.skeleton(name="cgdsr",path="~/work/cgds/pkg/")
-
-### Status
-# x getCancerTypes=function()
-# x getGeneticProfiles=function(cancerTypeId)
-# x getCaseLists(cancerTypeId)
-# x getProfileData=function(geneList, geneticProfileId, caseSetId) {}
-## Not implemented
-# multiple calls ... single gene. getProfileData should handle this, Error when multiple genes and profiles.
-# getProfileDataBatch=function(geneList, geneticProfileIdList, caseSetId) {}
-# getMutationData(geneList, geneticProfileId, caseSetId)
-
-## TODO:
-# see if we can set baseURL as an object/class variable
-# look at error for testUrl, should be fine ...
-# getMutationData(geneList, geneticProfileId, caseSetId)
-
-### Questions,
-# * Should we do some kind of unit testing?
-# If done properly, this would require a static working URL
-# and static meaningful types for some of the functions
-# * Some of the more specific functions, i.e. getSomaticMutationFrequency,
-# probably require static types ...
-
-# Building
-#> R CMD check cgdsr
-#> R CMD build cgdsr
-
-# Installing
-
-#> R CMD INSTALL cgdsr_1.0.1.tar.gz
-
-#detach(package:cgdsr)
-library('cgdsr')
-ovc = CGDS("http://cbio.mskcc.org/cgds-public-ovarian/")
-cgdspub = CGDS("http://cbio.mskcc.org/cgds-public/")
-test(ovc) # very simple test of the web service
-test(cgdspub)
-#help(cgdsr)
-help(test)
-
-getCaseLists(ovc,'ova')
-getGeneticProfiles(ovc,'ova')
-x=getProfileData(ovc,c('BRCA1','BRCA2'),'ova_mrna_median','ova_all')
-y=getProfileData(ovc,'BRCA1',c('ova_mrna_median','ova_gistic'),'ova_all') # not implemented at the ova portal yet
-
-getCancerTypes(cgdspub)
-getCaseLists(cgdspub,'pca')
-getGeneticProfiles(cgdspub,'pca')
-x2=getProfileData(cgdspub,c('BRCA1','BRCA2'),'pca_mrna','pca_all') # works
-y2=getProfileData(cgdspub,'BRCA1',c('pca_cna','pca_mrna'),'pca_all') # works
-
-#scp ../index.html cgdsr_1.0.1.tar.gz cbio:public_html/cgdsr/
-
-### Get top CNA loci in ovarian cancer
-
-getCaseLists(ovc,'ova') # 'ova_all'
-getGeneticProfiles(ovc,'ova')
-getProfileData(ovc,c('BRCA1'),c('ova_rae'),'ova_all')
-getProfileData(ovc,c('BRCA1','BRCA4','BRCA2'),c('ova_rae'),'ova_all')
-
-
-# test warnings and error messages
-
-getCaseLists(mycgds,'xxx') # OK, error no case lists
-getGeneticProfiles(mycgds,'xxx') # OK, error no genetic profiles
-getProfileData(mycgds,'NF13','gbm_mrna','gbm_all') # ! No warning that NF13 could not be found; we can't handle warnings properly
-getProfileData(mycgds,'NF1','gbm_rna','gbm_all') # OK - error no genetic profiles
-getProfileData(mycgds,'NF1','gbm_mrna','gbm_ll') # ! error
-
-
-
-
-# Action items
-
-# how could we get data for many genes ... 'gene lists: i.e. Proteins, miRNAs, ncRNAs, All ...'
-# can we get CNA scores for individual loci ...
-
diff --git a/cgds-r/misc/regr.R b/cgds-r/misc/regr.R
deleted file mode 100755
index 2329f80d2ef..00000000000
--- a/cgds-r/misc/regr.R
+++ /dev/null
@@ -1,105 +0,0 @@
-#!/home/ajac/apps/bin/Rscript --no-save
-#options(warn=-99)
-
-library('ggplot2')
-library('cgdsapi')
-
-srv = CGDS("http://cbio.mskcc.org/cgds-public/",FALSE)
-#test(srv) # very simple test of the web service
-
-cancertype = commandArgs(TRUE)[1]
-gene1 = commandArgs(TRUE)[2]
-gene2 = commandArgs(TRUE)[3]
-
-datanames = data.frame(
- ovc = c('ova_all','ova_gistic','ova_mrna_median'),
- pca.primary = c('pca_primary','pca_cna','pca_mrna'),
- pca = c('pca_all','pca_cna','pca_mrna'))
-
-#print(cancertype)
-#print(colnames(data))
-
-if (!(cancertype %in% colnames(datanames))) {
- print("error: cancertype not defined")
- quit(status = 1) # stop the script; return() is not valid outside a function
-}
-
-case.set = datanames[1,cancertype]
-cna.type = datanames[2,cancertype]
-mrna.type = datanames[3,cancertype]
-
-###
-
-vp.layout <- function(x, y) viewport(layout.pos.row=x, layout.pos.col=y)
-arrange <- function(..., nrow=NULL, ncol=NULL, as.table=FALSE) {
- dots <- list(...)
- n <- length(dots)
- if(is.null(nrow) & is.null(ncol)) { nrow = floor(n/2) ; ncol = ceiling(n/nrow)}
- if(is.null(nrow)) { nrow = ceiling(n/ncol)}
- if(is.null(ncol)) { ncol = ceiling(n/nrow)}
- ## NOTE see n2mfrow in grDevices for possible alternative
-grid.newpage()
-pushViewport(viewport(layout=grid.layout(nrow,ncol) ) )
- ii.p <- 1
- for(ii.row in seq(1, nrow)){
- ii.table.row <- ii.row
- if(as.table) {ii.table.row <- nrow - ii.table.row + 1}
- for(ii.col in seq(1, ncol)){
- ii.table <- ii.p
- if(ii.p > n) break
- print(dots[[ii.table]], vp=vp.layout(ii.table.row, ii.col))
- ii.p <- ii.p + 1
- }
- }
-}
-
-# expression correlation of gene1 versus gene2, controlling for CNA status of gene2
-# plot1: gene1 vs gene2 expression scatter
-# plot2: gene2 CNA status
-# plot3: series of plots, gene1 vs gene2 expression scatter stratified by gene2 CNA status
-
-x1=data.frame(getProfileData(srv,c(gene1),mrna.type,case.set))
-x=data.frame(t(x1[,-c(1,2)]))
-
-x2=data.frame(getProfileData(srv,c(gene2),mrna.type,case.set))
-x=cbind(x,data.frame(t(x2[,-c(1,2)])))
-
-cna=data.frame(getProfileData(srv,gene2,cna.type,case.set))
-cna=data.frame(t(cna[,-c(1,2)]))
-x=cbind(x,cna)
-
-names(x) = c('gene1','gene2','gene2.CNA')
-
-x = x[!is.na(x[,3]),]# remove samples with NA
-
-theme_set(theme_bw())
-
-gene1vsgene2.scatter <- qplot(data=x, x=gene1, y=gene2, alpha=0.3,size=2) +
- stat_smooth(method='lm',se=FALSE,size=1,color='red') +
- xlab(paste(gene1,'expression'))+ylab(paste(gene2,'expression')) + opts(legend.position = "none");
-
-gene2CNA.boxplot <- qplot(data=x, x=factor(gene2.CNA), y=gene2,alpha=0) +
- geom_boxplot(alpha=1,outlier.size=0,outlier.colour='white') +
- geom_jitter(size=2,alpha=0.3,position=position_jitter(width=0.2)) +
- xlab(paste(gene2,'CNA status')) + ylab(paste(gene2,'expression')) + opts(legend.position = "none");
-
-gene1vsgene2.CNA.scatter <- qplot(data=x, x=gene1, y=gene2,alpha=0.3,size=2) +
- stat_smooth(method='lm',se=FALSE,size=1,color='red') + facet_wrap(~ gene2.CNA) +
- xlab(paste(gene1,'expression'))+ylab(paste(gene2,'expression')) + opts(legend.position = "none");
-
-
-pearson.test = cor.test(x$gene1,x$gene2)
-regr.test = summary(lm(gene2 ~ gene2.CNA + gene1, data = x))
-
-print(pearson.test)
-print(regr.test)
-
-pdf(paste(cancertype,'-',gene1,'-',gene2,'.pdf',sep=""))
-arrange(gene1vsgene2.scatter,gene2CNA.boxplot,gene1vsgene2.CNA.scatter,ncol=2)
-gp1=gpar(col="black", fontsize=8)
-px = 0.55; py = 0.45; offset = 0.03;
-grid.text(paste("Pearson r = ",sprintf("%.2f",pearson.test$estimate),", p-value = ",sprintf("%.1e",pearson.test$p.value)),px,py,gp=gp1,just='left')
-grid.text("Regression : ",px,py-1*offset,gp=gp1,just='left')
-grid.text(paste('CNA : z = ', sprintf("%.2f",regr.test$coefficients[2,3]),', p-value = ', sprintf("%.1e",regr.test$coefficients[2,4])),px,py-offset*2,gp=gp1,just='left')
-grid.text(paste(gene1, ' : z = ', sprintf("%.2f",regr.test$coefficients[3,3]),', p-value = ', sprintf("%.1e",regr.test$coefficients[3,4])),px,py-offset*3,gp=gp1,just='left')
-dev.off()
diff --git a/cgds-r/misc/test.R b/cgds-r/misc/test.R
deleted file mode 100644
index 25ef49788eb..00000000000
--- a/cgds-r/misc/test.R
+++ /dev/null
@@ -1,172 +0,0 @@
-setwd("/home/ajac/cvs/cgds-r/misc")
-
-library('cgdsr')
-
-## test miso DEV
-c = CGDS("http://miso-dev.cbio.mskcc.org:38080/gdac-portal/")
-
-pdf('test_brca_tp53.pdf')
-plot(c,'brca','TP53',c('brca_mrna','brca_RPPA_protein_level'),'brca_basal',skin='cna_mut',skin.col.gp=c('brca_gistic','brca_mutations'))
-plot(c,'brca','TP53',c('brca_mrna','brca_RPPA_protein_level'),'brca_basal',skin='cna_mut',skin.col.gp=c('brca_gistic','brca_mutations'),legend.pos='topleft')
-plot(c,'brca','TP53',c('brca_mrna','brca_RPPA_protein_level'),'brca_basal',skin='cna_mut',skin.col.gp=c('brca_gistic','brca_mutations'),legend.pos='bottom')
-plot(c,'brca','TP53',c('brca_mrna','brca_RPPA_protein_level'),'brca_basal',skin='cna_mut',skin.col.gp=c('brca_gistic','brca_mutations'),legend.pos='topleft',add.corr='pearson')
-plot(c,'brca','TP53',c('brca_mrna','brca_RPPA_protein_level'),'brca_basal',skin='cna_mut',skin.col.gp=c('brca_gistic','brca_mutations'),legend.pos='topleft',add.corr='spearman')
-plot(c,'brca','TP53',c('brca_mrna','brca_RPPA_protein_level'),'brca_luma',skin='cna_mut',skin.col.gp=c('brca_gistic','brca_mutations'),legend.pos='topleft',add.corr='pearson')
-plot(c,'brca','TP53',c('brca_mrna','brca_RPPA_protein_level'),'brca_lumb',skin='cna_mut',skin.col.gp=c('brca_gistic','brca_mutations'),legend.pos='topleft',add.corr='pearson')
-dev.off()
-
-# tests for manuscript
-
-# Install R package and dependencies from CRAN
-> install.packages("cgdsr")
-
-# load library and establish connection to CGDS Web API
-> library(cgdsr)
-> mycgds = CGDS("http://www.cbioportal.org/public-portal/")
-
-# Browse CGDS-R package tutorial PDF
-> vignette("cgdsr")
-
-# Which cancer types are available?
-> getCancerStudies(mycgds)[, c(1, 2)]
-
-# Get DNA copy number and mRNA expression data for NF1 gene in GBM dataset
-# first three samples shown
-> getProfileData(mycgds, "NF1", c("gbm_cna_rae", "gbm_mrna"), "gbm_all")
-
-# Use CGDS-R plot function to plot DNA CNA and mRNA exp. data for NF1
-> plot(mycgds, "gbm", "NF1", c("gbm_cna_rae", "gbm_mrna"),
- "gbm_all", skin = "disc_cont")
-
-# Scatter plot of MDM2 and MDM4 expression levels in GBM dataset
-> plot(mycgds, "gbm", c("MDM2", "MDM4"), "gbm_mrna", "gbm_all")
-
-
-####
-####
-####
-
-# test public portal
-c = CGDS("http://www.cbioportal.org/public-portal/")
-test(c)
-
-pdf('test_plots_public_portal.pdf')
-plot(c, 'gbm_tcga', 'MDM2', 'gbm_tcga_mrna', 'gbm_tcga_all') # skin = 'cont'
-plot(c, 'gbm_tcga', 'MDM2', 'gbm_tcga_mrna', 'gbm_tcga_all',skin = 'cont')
-plot(c, 'ov_tcga', 'MDM2', 'ov_tcga_mrna_median', 'ov_tcga_all')
-plot(c, 'ov_tcga', 'BRCA1', 'ov_tcga_mrna_median', cases=c('TCGA-61-2109','TCGA-61-2110','TCGA-61-2111'))
-plot(c, 'gbm_tcga', 'MDM2', 'gbm_tcga_cna_rae', 'gbm_tcga_all', skin = 'disc')
-
-plot(c, 'gbm_tcga', c('MDM2','MDM4'), 'gbm_tcga_mrna' , 'gbm_tcga_all')
-#plot(c, 'ov_tcga', c('hsa-miR-29a','DNMT3A'), 'ov_tcga_mrna_median' , 'ov_tcga_all')
-
-plot(c, 'gbm_tcga', c('NF1'), c('gbm_tcga_cna_rae','gbm_tcga_mrna'), 'gbm_tcga_all', skin = 'disc_cont')
-plot(c, 'ov_tcga', c('BRCA2'), c('ov_tcga_gistic','ov_tcga_mrna_median'), 'ov_tcga_all', skin = 'disc_cont')
-#plot(c, 'ov_tcga', c('hsa-miR-29a'), c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'disc_cont')
-
-plot(c, 'ov_tcga', 'TP53', c('ov_tcga_gistic','ov_tcga_mrna_median'), 'ov_tcga_all', skin = 'cna_mrna_mut' , skin.col.gp='ov_tcga_mutations')
-
-## plot(c, 'ov_tcga', 'BRCA2', c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mrna_mut')
-## plot(c, 'ov_tcga', 'BRCA2', c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mrna_mut', skin.normals='ova_normal_mrna')
-## plot(c, 'ov_tcga', 'CCNE1', c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mrna_mut', skin.normals='ova_normal_mrna')
-## plot(c, 'ov_tcga', 'hsa-miR-29a', c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mrna_mut', skin.normals='ova_normal_mrna')
-## plot(c, 'ov_tcga', 'BRCA2', c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mrna_mut', skin.col.gp='ova_mutations_next_gen')
-## plot(c, 'ov_tcga', 'BRCA2', c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mrna_mut', skin.normals='ova_normal_mrna', skin.col.gp='ova_mutations_next_gen')
-## plot(c, 'ov_tcga', 'TP53', c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mrna_mut', skin.col.gp='ova_mutations_next_gen')
-## plot(c, 'ov_tcga', 'TP53', c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mrna_mut', skin.normals='ova_normal_mrna', skin.col.gp='ova_mutations_next_gen')
-## plot(c, 'ov_tcga', 'hsa-miR-29a', c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mrna_mut', skin.col.gp='ova_mutations_next_gen')
-## plot(c, 'ov_tcga', 'hsa-miR-29a', c('ova_rae','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mrna_mut', skin.normals='ova_normal_mrna', skin.col.gp='ova_mutations_next_gen')
-
-## plot(c, 'ov_tcga', 'BRCA1', c('ova_methylation','ov_tcga_mrna_median'), 'ova_all', skin = 'meth_mrna_cna_mut', skin.col.gp=c('ova_rae','ova_mutations_next_gen'))
-## plot(c, 'ov_tcga', 'BRCA1', c('ova_methylation','ov_tcga_mrna_median'), 'ova_all', skin = 'meth_mrna_cna_mut', skin.normals='ova_normal_mrna', skin.col.gp=c('ova_rae','ova_mutations_next_gen'))
-## plot(c, 'ov_tcga', 'BRCA2', c('ova_methylation','ov_tcga_mrna_median'), 'ova_all', skin = 'meth_mrna_cna_mut', skin.col.gp=c('ova_rae','ova_mutations_next_gen'))
-## plot(c, 'ov_tcga', 'BRCA2', c('ova_methylation','ov_tcga_mrna_median'), 'ova_all', skin = 'meth_mrna_cna_mut', skin.normals='ova_normal_mrna', skin.col.gp=c('ova_rae','ova_mutations_next_gen'))
-
-# testcase for skin cna_mut, should not limit x axis to range 0-1, we also test switching axis
-plot(c, 'ov_tcga', 'BRCA2', c('ova_methylation','ov_tcga_mrna_median'), 'ova_all', skin = 'cna_mut', skin.normals='ova_normal_mrna', skin.col.gp=c('ova_rae','ova_mutations_next_gen'))
-plot(c, 'ov_tcga', 'BRCA2', c('ov_tcga_mrna_median','ova_methylation'), 'ova_all', skin = 'cna_mut', skin.normals='ova_normal_mrna', skin.col.gp=c('ova_rae','ova_mutations_next_gen'))
-
-# testcase with no expression data for normal cases
-plot(c, 'ov_tcga', 'BRCA1', c('ova_rae','ov_tcga_mrna_median_Zscores'), 'ova_all', skin = 'cna_mrna_mut', skin.normals='ova_normal_mrna')
-# testcase with no expression data for normal cases
-plot(c, 'ov_tcga', 'BRCA1', c('ova_methylation','ov_tcga_mrna_median_Zscores'), 'ova_all', skin = 'meth_mrna_cna_mut', skin.normals='ova_normal_mrna', skin.col.gp=c('ova_rae','ova_mutations_next_gen'))
-
-# some tests of plotting with limited data
-plot(c, 'ov_tcga', 'TP53', c('ov_tcga_rae','ov_tcga_mrna_median'), cases=c('TCGA-04-1331','TCGA-04-1336'), skin = 'cna_mrna_mut', skin.normals='ov_tcga_normal_mrna', skin.col.gp='ov_tcga_mutations')
-plot(c, 'ov_tcga', 'TP53', c('ov_tcga_rae','ov_tcga_mrna_median'), cases=c('TCGA-04-1331','TCGA-04-1332','TCGA-04-1336'), skin = 'cna_mrna_mut', skin.col.gp='ov_tcga_mutations')
-plot(c, 'ov_tcga', 'TP53', c('ov_tcga_rae','ov_tcga_mrna_median'), cases=c('TCGA-04-1331','TCGA-04-1336'), skin = 'cna_mrna_mut', skin.col.gp='ov_tcga_mutations')
-plot(c, 'ov_tcga', 'TP53', c('ov_tcga_rae','ov_tcga_mrna_median'), cases=c('TCGA-04-1331'), skin = 'cna_mrna_mut', skin.col.gp='ov_tcga_mutations')
-
-dev.off()
-
-###
-# get mutation data
-
-getMutationData(c,'gbm_tcga_all','gbm_tcga_mutations',c('EGFR','PTEN'))
-
-# sarcoma testcase with NAs in mRNA data
-#plot(c, 'mskcc_broad_sarc', c('CBFA2T3'), c('Sarc_cna','Sarc_mrna'), 'Sarc_all', skin='cna_mrna_mut' , skin.col.gp=c('Sarc_mutations'), skin.normals='Sarc_normal')
-
-####
-
-# gbm test case
-c = CGDS("http://www.cbioportal.org/public-portal/")
-test(c)
-setVerbose(c,1)
-setPlotErrorMsg(c,"Some text here ...")
-
-plot(c, 'gbm_tcga', c('EGFR'), c('gbm_tcga_gistic','gbm_tcga_mrna'), cases=c("TCGA-12-0772","TCGA-06-6700"), skin='cna_mrna_mut' , skin.col.gp=c('gbm_tcga_mutations'));
-
-plot(c, 'gbm_tcga', c('EGFR'), c('gbm_tcga_gistic','gbm_tcga_mrna'), cases=c("TCGA-12-0772","TCGA-06-6700"), skin='cna_mrna_mut' , skin.col.gp='gbm_tcga_mutations');
-
-plot(c, 'gbm_tcga', c('EGFR'), c('gbm_tcga_gistic'), cases=c("TCGA-12-0772","TCGA-06-6700"));
-
-getProfileData(c, c('EGFR'), 'gbm_tcga_mutations', cases=c("TCGA-12-0772","TCGA-06-6700"))
-getProfileData(c, c('EGFR'), 'gbm_tcga_mutations', 'gbm_tcga_all')
-
-getProfileData(c, c('TP53'), 'ov_tcga_mutations', cases=c("TCGA-12-0772","TCGA-06-6700"))
-
-####
-
-pdf('test_firehose.pdf')
-c = CGDS("http://buri.cbio.mskcc.org:38080/cgds_tcga_internal_portal/")
-# test for mutations, splice site mutation (e...2) should not be detected different from missense (e23f)
-plot(c, 'gbm_tcga', 'RANBP6', c('gbm_gistic','gbm_tcga_mrna_median_Zscores'), 'gbm_3way_complete', skin = 'cna_mrna_mut',skin.col.gp=c('gbm_mutations'))
-dev.off()
-
-# The following tests should all raise errors
-
-pdf('test_errors.pdf')
-
-plot(c, 'gbm_tcga', 'MDM2', c('gbm_tcga_mrna','gbm_tcga_mrna'), 'gbm_tcga_all', skin = 'disc')
-plot(c, 'gbm_tcga', c('MDM2','MDM2','MDM2'), 'gbm_tcga_mrna', 'gbm_tcga_all')
-plot(c, 'gbm_tcga', c('MDM2','MDM2'), c('gbm_tcga_mrna','gbm_tcga_mrna'), 'gbm_tcga_all')
-
-# unknown gene
-plot(c, 'gbm_tcga', 'MDM12', 'gbm_tcga_mrna', 'gbm_tcga_all')
-
-# unknown case set id
-plot(c, 'gbm_tcga', 'MDM2', 'gbm_tcga_mrna', 'xxx')
-
-# this should be a text error, has to be fixed at WEB API
-#plot2(c, 'gbm_tcga', 'MDM12', c('gbm_tcga_mrna','gbm_tcga_mrna'), 'cont', 'gbm_tcga_all')
-
-# wrong IDs
-plot(c, 'gbm_tcga', 'DNMT3A', c('ova_ra','ov_tcga_mrna_median'), 'ova_all')
-plot(c, 'gbm_tcga', c('hsa-miR-29A','DNMT3A'), 'ov_tcga_mrna_median', 'ova_all')
-plot(c, 'gbm_tcga', c('DNMT3B','DNMT3D'), 'ov_tcga_mrna_median', 'ova_all')
-plot(c, 'gbm_tcga', c('DNMT3C','DNMT3D'), 'ov_tcga_mrna_median', 'ova_all')
-
-# wrong caseList
-plot(c, 'gbm_tcga', 'MDM2', 'gbm_tcga_mrna', 'cont', 'gbm_tcga_alls')
-
-# no data for combination of gene and genetic profile
-plot(c,'ov_tcga',c('BRCA1','CCNE1'),'ova_methylation','ova_all')
-plot(c,'ov_tcga','CCNE1',c('ov_tcga_mrna_median','ova_methylation'),'ova_all')
-plot(c,'ov_tcga','CCNE1',c('ova_methylation'),'ova_all')
-plot(c,'ov_tcga',c('CCNE1','hsa-miR-21'),c('ova_methylation'),'ova_all')
-
-# unknown skin ID
-plot(c, 'gbm_tcga', 'MDM2', 'gbm_tcga_mrna', 'gbm_tcga_all', skin = 'xxx')
-
-dev.off()
diff --git a/cgds-r/web/cgdsr.pdf b/cgds-r/web/cgdsr.pdf
deleted file mode 100644
index 8227d94ed7c..00000000000
Binary files a/cgds-r/web/cgdsr.pdf and /dev/null differ
diff --git a/cgds-r/web/index.html b/cgds-r/web/index.html
deleted file mode 100644
index 63ed98cbb34..00000000000
--- a/cgds-r/web/index.html
+++ /dev/null
@@ -1,62 +0,0 @@
-
-
-CGDS-R package
-
-
-
-
-
-CGDS-R package
-
-
-Get the most recent version of the R package here.
-The CGDS-R documentation manual: cgdsr_manual.pdf.
-The CGDS-R documentation vignette: cgdsr.pdf.
-
-
-
-Installation
-
-Make sure that you have installed the 'R.oo' package, in an R shell:
-
-
-> install.packages('R.oo')
-
-
-Then install the cgds-R package in a unix shell:
-
-
-R CMD INSTALL cgdsr_1.0.14.tar.gz
-
-
-
-
-Example usage
-
-
-library('cgdsr')
-
-mycgds = CGDS("http://cbio.mskcc.org/cgds-public/")
-
-# basic server API tests
-test(mycgds)
-
-# get list of cancer types at server
-getCancerTypes(mycgds)
-
-# get available case lists (collection of samples) for a given cancer type
-mycancertype = getCancerTypes(mycgds)[1,1]
-mycaselist = getCaseLists(mycgds,mycancertype)[1,1]
-
-# get available genetic profiles
-mygeneticprofile = getGeneticProfiles(mycgds,mycancertype)[4,1]
-
-# get data for a specified list of genes, datatypes and case list
-getProfileData(mycgds,c('BRCA1','BRCA2'),mygeneticprofile,mycaselist)
-
-# documentation
-help('cgdsr')
-help('CGDS')
-
-
- *
- *
- */
-@XmlAccessorType(XmlAccessType.FIELD)
-@XmlType(name = "CasesType", propOrder = {
- "_case"
-})
-public class CasesType {
-
- @XmlElement(name = "Case")
- protected List _case;
-
- /**
- * Gets the value of the case property.
- *
- *
- * This accessor method returns a reference to the live list,
- * not a snapshot. Therefore any modification you make to the
- * returned list will be present inside the JAXB object.
- * This is why there is not a set method for the case property.
- *
- *
- * For example, to add a new item, do as follows:
- *
- * getCase().add(newItem);
- *
- *
- *
- *
- * Objects of the following type(s) are allowed in the list
- * {@link CaseType }
- *
- *
- */
- public List getCase() {
- if (_case == null) {
- _case = new ArrayList();
- }
- return this._case;
- }
-
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/ClientCaseInfoType.java b/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/ClientCaseInfoType.java
deleted file mode 100644
index c5decd061a5..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/ClientCaseInfoType.java
+++ /dev/null
@@ -1,69 +0,0 @@
-//
-// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.4-2
-// See http://java.sun.com/xml/jaxb
-// Any modifications to this file will be lost upon recompilation of the source schema.
-// Generated on: 2014.08.05 at 10:50:18 AM EDT
-//
-
-
-package org.mskcc.cbio.foundation.jaxb;
-
-import javax.xml.bind.annotation.XmlAccessType;
-import javax.xml.bind.annotation.XmlAccessorType;
-import javax.xml.bind.annotation.XmlElement;
-import javax.xml.bind.annotation.XmlType;
-
-
-/**
- * Java class for ClientCaseInfoType complex type.
- *
- * The following schema fragment specifies the expected content contained within this class.
- *
- *
- *
- *
- */
-@XmlAccessorType(XmlAccessType.FIELD)
-@XmlType(name = "copy-number-alterationsType", propOrder = {
- "content"
-})
-public class CopyNumberAlterationsType {
-
- @XmlElementRef(name = "copy-number-alteration", type = JAXBElement.class, required = false)
- @XmlMixed
- protected List content;
-
- /**
- * Gets the value of the content property.
- *
- *
- * This accessor method returns a reference to the live list,
- * not a snapshot. Therefore any modification you make to the
- * returned list will be present inside the JAXB object.
- * This is why there is not a set method for the content property.
- *
- *
- * For example, to add a new item, do as follows:
- *
- * getContent().add(newItem);
- *
- *
- *
- *
- * Objects of the following type(s) are allowed in the list
- * {@link String }
- * {@link JAXBElement }{@code <}{@link CopyNumberAlterationType }{@code >}
- *
- *
- */
- public List getContent() {
- if (content == null) {
- content = new ArrayList();
- }
- return this.content;
- }
-
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/MetricType.java b/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/MetricType.java
deleted file mode 100644
index 606e8166492..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/MetricType.java
+++ /dev/null
@@ -1,175 +0,0 @@
-//
-// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.4-2
-// See http://java.sun.com/xml/jaxb
-// Any modifications to this file will be lost upon recompilation of the source schema.
-// Generated on: 2014.08.05 at 10:50:18 AM EDT
-//
-
-
-package org.mskcc.cbio.foundation.jaxb;
-
-import javax.xml.bind.annotation.XmlAccessType;
-import javax.xml.bind.annotation.XmlAccessorType;
-import javax.xml.bind.annotation.XmlAttribute;
-import javax.xml.bind.annotation.XmlType;
-import javax.xml.bind.annotation.XmlValue;
-
-
-/**
- * Java class for metricType complex type.
- *
- * The following schema fragment specifies the expected content contained within this class.
- *
- *
- *
- *
- */
-@XmlAccessorType(XmlAccessType.FIELD)
-@XmlType(name = "metricsType", propOrder = {
- "metric"
-})
-public class MetricsType {
-
- protected List metric;
-
- /**
- * Gets the value of the metric property.
- *
- *
- * This accessor method returns a reference to the live list,
- * not a snapshot. Therefore any modification you make to the
- * returned list will be present inside the JAXB object.
- * This is why there is not a set method for the metric property.
- *
- *
- * For example, to add a new item, do as follows:
- *
- * getMetric().add(newItem);
- *
- *
- *
- *
- * Objects of the following type(s) are allowed in the list
- * {@link MetricType }
- *
- *
- */
- public List getMetric() {
- if (metric == null) {
- metric = new ArrayList();
- }
- return this.metric;
- }
-
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/ObjectFactory.java b/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/ObjectFactory.java
deleted file mode 100644
index d97ab4cd32a..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/ObjectFactory.java
+++ /dev/null
@@ -1,219 +0,0 @@
-//
-// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.4-2
-// See http://java.sun.com/xml/jaxb
-// Any modifications to this file will be lost upon recompilation of the source schema.
-// Generated on: 2014.08.05 at 10:50:18 AM EDT
-//
-
-
-package org.mskcc.cbio.foundation.jaxb;
-
-import javax.xml.bind.JAXBElement;
-import javax.xml.bind.annotation.XmlElementDecl;
-import javax.xml.bind.annotation.XmlRegistry;
-import javax.xml.namespace.QName;
-
-
-/**
- * This object contains factory methods for each
- * Java content interface and Java element interface
- * generated in the org.foundation.data package.
- * An ObjectFactory allows you to programatically
- * construct new instances of the Java representation
- * for XML content. The Java representation of XML
- * content can consist of schema derived interfaces
- * and classes representing the binding of schema
- * type definitions, element declarations and model
- * groups. Factory methods for each of these are
- * provided in this class.
- *
- */
-@XmlRegistry
-public class ObjectFactory {
-
- private final static QName _ClientCaseInfo_QNAME = new QName("", "ClientCaseInfo");
- private final static QName _VariantReportTypeRearrangementsRearrangement_QNAME = new QName("", "rearrangement");
- private final static QName _SampleTypeComment_QNAME = new QName("", "comment");
- private final static QName _CopyNumberAlterationsTypeCopyNumberAlteration_QNAME = new QName("", "copy-number-alteration");
-
- /**
- * Create a new ObjectFactory that can be used to create new instances of schema derived classes for package: org.foundation.data
- *
- */
- public ObjectFactory() {
- }
-
- /**
- * Create an instance of {@link VariantReportType }
- *
- */
- public VariantReportType createVariantReportType() {
- return new VariantReportType();
- }
-
- /**
- * Create an instance of {@link ClientCaseInfoType }
- *
- */
- public ClientCaseInfoType createClientCaseInfoType() {
- return new ClientCaseInfoType();
- }
-
- /**
- * Create an instance of {@link CopyNumberAlterationsType }
- *
- */
- public CopyNumberAlterationsType createCopyNumberAlterationsType() {
- return new CopyNumberAlterationsType();
- }
-
- /**
- * Create an instance of {@link CasesType }
- *
- */
- public CasesType createCasesType() {
- return new CasesType();
- }
-
- /**
- * Create an instance of {@link CaseType }
- *
- */
- public CaseType createCaseType() {
- return new CaseType();
- }
-
- /**
- * Create an instance of {@link RearrangementsType }
- *
- */
- public RearrangementsType createRearrangementsType() {
- return new RearrangementsType();
- }
-
- /**
- * Create an instance of {@link SamplesType }
- *
- */
- public SamplesType createSamplesType() {
- return new SamplesType();
- }
-
- /**
- * Create an instance of {@link MetricsType }
- *
- */
- public MetricsType createMetricsType() {
- return new MetricsType();
- }
-
- /**
- * Create an instance of {@link QualityControlType }
- *
- */
- public QualityControlType createQualityControlType() {
- return new QualityControlType();
- }
-
- /**
- * Create an instance of {@link ShortVariantsType }
- *
- */
- public ShortVariantsType createShortVariantsType() {
- return new ShortVariantsType();
- }
-
- /**
- * Create an instance of {@link MetricType }
- *
- */
- public MetricType createMetricType() {
- return new MetricType();
- }
-
- /**
- * Create an instance of {@link ShortVariantType }
- *
- */
- public ShortVariantType createShortVariantType() {
- return new ShortVariantType();
- }
-
- /**
- * Create an instance of {@link RearrangementType }
- *
- */
- public RearrangementType createRearrangementType() {
- return new RearrangementType();
- }
-
- /**
- * Create an instance of {@link SampleType }
- *
- */
- public SampleType createSampleType() {
- return new SampleType();
- }
-
- /**
- * Create an instance of {@link CopyNumberAlterationType }
- *
- */
- public CopyNumberAlterationType createCopyNumberAlterationType() {
- return new CopyNumberAlterationType();
- }
-
- /**
- * Create an instance of {@link VariantReportType.Rearrangements }
- *
- */
- public VariantReportType.Rearrangements createVariantReportTypeRearrangements() {
- return new VariantReportType.Rearrangements();
- }
-
- /**
- * Create an instance of {@link JAXBElement }{@code <}{@link ClientCaseInfoType }{@code >}}
- *
- */
- @XmlElementDecl(namespace = "", name = "ClientCaseInfo")
- public JAXBElement createClientCaseInfo(ClientCaseInfoType value) {
- return new JAXBElement(_ClientCaseInfo_QNAME, ClientCaseInfoType.class, null, value);
- }
-
- /**
- * Create an instance of {@link JAXBElement }{@code <}{@link RearrangementType }{@code >}}
- *
- */
- @XmlElementDecl(namespace = "", name = "rearrangement", scope = VariantReportType.Rearrangements.class)
- public JAXBElement createVariantReportTypeRearrangementsRearrangement(RearrangementType value) {
- return new JAXBElement(_VariantReportTypeRearrangementsRearrangement_QNAME, RearrangementType.class, VariantReportType.Rearrangements.class, value);
- }
-
- /**
- * Create an instance of {@link JAXBElement }{@code <}{@link String }{@code >}}
- *
- */
- @XmlElementDecl(namespace = "", name = "comment", scope = SampleType.class)
- public JAXBElement createSampleTypeComment(String value) {
- return new JAXBElement(_SampleTypeComment_QNAME, String.class, SampleType.class, value);
- }
-
- /**
- * Create an instance of {@link JAXBElement }{@code <}{@link CopyNumberAlterationType }{@code >}}
- *
- */
- @XmlElementDecl(namespace = "", name = "copy-number-alteration", scope = CopyNumberAlterationsType.class)
- public JAXBElement createCopyNumberAlterationsTypeCopyNumberAlteration(CopyNumberAlterationType value) {
- return new JAXBElement(_CopyNumberAlterationsTypeCopyNumberAlteration_QNAME, CopyNumberAlterationType.class, CopyNumberAlterationsType.class, value);
- }
-
- /**
- * Create an instance of {@link JAXBElement }{@code <}{@link String }{@code >}}
- *
- */
- @XmlElementDecl(namespace = "", name = "comment", scope = RearrangementType.class)
- public JAXBElement createRearrangementTypeComment(String value) {
- return new JAXBElement(_SampleTypeComment_QNAME, String.class, RearrangementType.class, value);
- }
-
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/QualityControlType.java b/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/QualityControlType.java
deleted file mode 100644
index 1491c951df7..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/QualityControlType.java
+++ /dev/null
@@ -1,97 +0,0 @@
-//
-// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.4-2
-// See http://java.sun.com/xml/jaxb
-// Any modifications to this file will be lost upon recompilation of the source schema.
-// Generated on: 2014.08.05 at 10:50:18 AM EDT
-//
-
-
-package org.mskcc.cbio.foundation.jaxb;
-
-import javax.xml.bind.annotation.XmlAccessType;
-import javax.xml.bind.annotation.XmlAccessorType;
-import javax.xml.bind.annotation.XmlAttribute;
-import javax.xml.bind.annotation.XmlElement;
-import javax.xml.bind.annotation.XmlType;
-
-
-/**
- * Java class for quality-controlType complex type.
- *
- * The following schema fragment specifies the expected content contained within this class.
- *
- *
- * This accessor method returns a reference to the live list,
- * not a snapshot. Therefore any modification you make to the
- * returned list will be present inside the JAXB object.
- * This is why there is not a set method for the content property.
- *
- *
- * For example, to add a new item, do as follows:
- *
- * getContent().add(newItem);
- *
- *
- *
- *
- * Objects of the following type(s) are allowed in the list
- * {@link String }
- * {@link JAXBElement }{@code <}{@link String }{@code >}
- *
- *
- */
- public List getContent() {
- if (content == null) {
- content = new ArrayList();
- }
- return this.content;
- }
-
- /**
- * Gets the value of the description property.
- *
- * @return
- * possible object is
- * {@link String }
- *
- */
- public String getDescription() {
- return description;
- }
-
- /**
- * Sets the value of the description property.
- *
- * @param value
- * allowed object is
- * {@link String }
- *
- */
- public void setDescription(String value) {
- this.description = value;
- }
-
- /**
- * Gets the value of the inFrame property.
- *
- * @return
- * possible object is
- * {@link String }
- *
- */
- public String getInFrame() {
- return inFrame;
- }
-
- /**
- * Sets the value of the inFrame property.
- *
- * @param value
- * allowed object is
- * {@link String }
- *
- */
- public void setInFrame(String value) {
- this.inFrame = value;
- }
-
- /**
- * Gets the value of the otherGene property.
- *
- * @return
- * possible object is
- * {@link String }
- *
- */
- public String getOtherGene() {
- return otherGene;
- }
-
- /**
- * Sets the value of the otherGene property.
- *
- * @param value
- * allowed object is
- * {@link String }
- *
- */
- public void setOtherGene(String value) {
- this.otherGene = value;
- }
-
- /**
- * Gets the value of the pos1 property.
- *
- * @return
- * possible object is
- * {@link String }
- *
- */
- public String getPos1() {
- return pos1;
- }
-
- /**
- * Sets the value of the pos1 property.
- *
- * @param value
- * allowed object is
- * {@link String }
- *
- */
- public void setPos1(String value) {
- this.pos1 = value;
- }
-
- /**
- * Gets the value of the pos2 property.
- *
- * @return
- * possible object is
- * {@link String }
- *
- */
- public String getPos2() {
- return pos2;
- }
-
- /**
- * Sets the value of the pos2 property.
- *
- * @param value
- * allowed object is
- * {@link String }
- *
- */
- public void setPos2(String value) {
- this.pos2 = value;
- }
-
- /**
- * Gets the value of the status property.
- *
- * @return
- * possible object is
- * {@link String }
- *
- */
- public String getStatus() {
- return status;
- }
-
- /**
- * Sets the value of the status property.
- *
- * @param value
- * allowed object is
- * {@link String }
- *
- */
- public void setStatus(String value) {
- this.status = value;
- }
-
- /**
- * Gets the value of the supportingReadPairs property.
- *
- * @return
- * possible object is
- * {@link Short }
- *
- */
- public Short getSupportingReadPairs() {
- return supportingReadPairs;
- }
-
- /**
- * Sets the value of the supportingReadPairs property.
- *
- * @param value
- * allowed object is
- * {@link Short }
- *
- */
- public void setSupportingReadPairs(Short value) {
- this.supportingReadPairs = value;
- }
-
- /**
- * Gets the value of the targetedGene property.
- *
- * @return
- * possible object is
- * {@link String }
- *
- */
- public String getTargetedGene() {
- return targetedGene;
- }
-
- /**
- * Sets the value of the targetedGene property.
- *
- * @param value
- * allowed object is
- * {@link String }
- *
- */
- public void setTargetedGene(String value) {
- this.targetedGene = value;
- }
-
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/RearrangementsType.java b/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/RearrangementsType.java
deleted file mode 100644
index 4538e6334ca..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/RearrangementsType.java
+++ /dev/null
@@ -1,74 +0,0 @@
-//
-// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.4-2
-// See http://java.sun.com/xml/jaxb
-// Any modifications to this file will be lost upon recompilation of the source schema.
-// Generated on: 2014.08.05 at 10:50:18 AM EDT
-//
-
-
-package org.mskcc.cbio.foundation.jaxb;
-
-import java.util.ArrayList;
-import java.util.List;
-import javax.xml.bind.annotation.XmlAccessType;
-import javax.xml.bind.annotation.XmlAccessorType;
-import javax.xml.bind.annotation.XmlType;
-
-
-/**
- * Java class for rearrangementsType complex type.
- *
- * The following schema fragment specifies the expected content contained within this class.
- *
- *
- *
- *
- */
-@XmlAccessorType(XmlAccessType.FIELD)
-@XmlType(name = "rearrangementsType", propOrder = {
- "rearrangement"
-})
-public class RearrangementsType {
-
- protected List rearrangement;
-
- /**
- * Gets the value of the rearrangement property.
- *
- *
- * This accessor method returns a reference to the live list,
- * not a snapshot. Therefore any modification you make to the
- * returned list will be present inside the JAXB object.
- * This is why there is not a set method for the rearrangement property.
- *
- *
- * For example, to add a new item, do as follows:
- *
- * getRearrangement().add(newItem);
- *
- *
- *
- *
- * Objects of the following type(s) are allowed in the list
- * {@link RearrangementType }
- *
- *
- */
- public List getRearrangement() {
- if (rearrangement == null) {
- rearrangement = new ArrayList();
- }
- return this.rearrangement;
- }
-
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/SampleType.java b/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/SampleType.java
deleted file mode 100644
index 1741619f8f4..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/SampleType.java
+++ /dev/null
@@ -1,163 +0,0 @@
-//
-// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.4-2
-// See http://java.sun.com/xml/jaxb
-// Any modifications to this file will be lost upon recompilation of the source schema.
-// Generated on: 2014.08.05 at 10:50:18 AM EDT
-//
-
-
-package org.mskcc.cbio.foundation.jaxb;
-
-import java.io.Serializable;
-import java.util.ArrayList;
-import java.util.List;
-import javax.xml.bind.JAXBElement;
-import javax.xml.bind.annotation.XmlAccessType;
-import javax.xml.bind.annotation.XmlAccessorType;
-import javax.xml.bind.annotation.XmlAttribute;
-import javax.xml.bind.annotation.XmlElementRef;
-import javax.xml.bind.annotation.XmlMixed;
-import javax.xml.bind.annotation.XmlType;
-
-
-/**
- * Java class for sampleType complex type.
- *
- * The following schema fragment specifies the expected content contained within this class.
- *
- *
- * This accessor method returns a reference to the live list,
- * not a snapshot. Therefore any modification you make to the
- * returned list will be present inside the JAXB object.
- * This is why there is not a set method for the content property.
- *
- *
- * For example, to add a new item, do as follows:
- *
- * getContent().add(newItem);
- *
- *
- *
- *
- * Objects of the following type(s) are allowed in the list
- * {@link String }
- * {@link JAXBElement }{@code <}{@link String }{@code >}
- *
- *
- */
- public List getContent() {
- if (content == null) {
- content = new ArrayList();
- }
- return this.content;
- }
-
- /**
- * Gets the value of the baitSet property.
- *
- * @return
- * possible object is
- * {@link String }
- *
- */
- public String getBaitSet() {
- return baitSet;
- }
-
- /**
- * Sets the value of the baitSet property.
- *
- * @param value
- * allowed object is
- * {@link String }
- *
- */
- public void setBaitSet(String value) {
- this.baitSet = value;
- }
-
- /**
- * Gets the value of the meanExonDepth property.
- *
- * @return
- * possible object is
- * {@link Float }
- *
- */
- public Float getMeanExonDepth() {
- return meanExonDepth;
- }
-
- /**
- * Sets the value of the meanExonDepth property.
- *
- * @param value
- * allowed object is
- * {@link Float }
- *
- */
- public void setMeanExonDepth(Float value) {
- this.meanExonDepth = value;
- }
-
- /**
- * Gets the value of the name property.
- *
- * @return
- * possible object is
- * {@link String }
- *
- */
- public String getName() {
- return name;
- }
-
- /**
- * Sets the value of the name property.
- *
- * @param value
- * allowed object is
- * {@link String }
- *
- */
- public void setName(String value) {
- this.name = value;
- }
-
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/SamplesType.java b/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/SamplesType.java
deleted file mode 100644
index ce0bb233240..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/SamplesType.java
+++ /dev/null
@@ -1,69 +0,0 @@
-//
-// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.4-2
-// See http://java.sun.com/xml/jaxb
-// Any modifications to this file will be lost upon recompilation of the source schema.
-// Generated on: 2014.08.05 at 10:50:18 AM EDT
-//
-
-
-package org.mskcc.cbio.foundation.jaxb;
-
-import javax.xml.bind.annotation.XmlAccessType;
-import javax.xml.bind.annotation.XmlAccessorType;
-import javax.xml.bind.annotation.XmlElement;
-import javax.xml.bind.annotation.XmlType;
-
-
-/**
- * Java class for samplesType complex type.
- *
- * The following schema fragment specifies the expected content contained within this class.
- *
- *
- *
- *
- */
-@XmlAccessorType(XmlAccessType.FIELD)
-@XmlType(name = "short-variantsType", propOrder = {
- "shortVariant"
-})
-public class ShortVariantsType {
-
- @XmlElement(name = "short-variant")
- protected List shortVariant;
-
- /**
- * Gets the value of the shortVariant property.
- *
- *
- * This accessor method returns a reference to the live list,
- * not a snapshot. Therefore any modification you make to the
- * returned list will be present inside the JAXB object.
- * This is why there is not a set method for the shortVariant property.
- *
- *
- * For example, to add a new item, do as follows:
- *
- * getShortVariant().add(newItem);
- *
- *
- *
- *
- * Objects of the following type(s) are allowed in the list
- * {@link ShortVariantType }
- *
- *
- */
- public List getShortVariant() {
- if (shortVariant == null) {
- shortVariant = new ArrayList();
- }
- return this.shortVariant;
- }
-
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/VariantReportType.java b/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/VariantReportType.java
deleted file mode 100644
index 12a82553d12..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/foundation/jaxb/VariantReportType.java
+++ /dev/null
@@ -1,341 +0,0 @@
-//
-// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.4-2
-// See http://java.sun.com/xml/jaxb
-// Any modifications to this file will be lost upon recompilation of the source schema.
-// Generated on: 2014.08.05 at 10:50:18 AM EDT
-//
-
-
-package org.mskcc.cbio.foundation.jaxb;
-
-import java.io.Serializable;
-import java.util.ArrayList;
-import java.util.List;
-import javax.xml.bind.JAXBElement;
-import javax.xml.bind.annotation.XmlAccessType;
-import javax.xml.bind.annotation.XmlAccessorType;
-import javax.xml.bind.annotation.XmlAttribute;
-import javax.xml.bind.annotation.XmlElement;
-import javax.xml.bind.annotation.XmlElementRef;
-import javax.xml.bind.annotation.XmlMixed;
-import javax.xml.bind.annotation.XmlType;
-
-
-/**
- * Java class for variant-reportType complex type.
- *
- * The following schema fragment specifies the expected content contained within this class.
- *
- *
- *
- *
- */
- @XmlAccessorType(XmlAccessType.FIELD)
- @XmlType(name = "", propOrder = {
- "content"
- })
- public static class Rearrangements {
-
- @XmlElementRef(name = "rearrangement", type = JAXBElement.class, required = false)
- @XmlMixed
- protected List content;
-
- /**
- * Gets the value of the content property.
- *
- *
- * This accessor method returns a reference to the live list,
- * not a snapshot. Therefore any modification you make to the
- * returned list will be present inside the JAXB object.
- * This is why there is not a set method for the content property.
- *
- *
- * For example, to add a new item, do as follows:
- *
- * getContent().add(newItem);
- *
- *
- *
- *
- * Objects of the following type(s) are allowed in the list
- * {@link String }
- * {@link JAXBElement }{@code <}{@link RearrangementType }{@code >}
- *
- *
- */
- public List getContent() {
- if (content == null) {
- content = new ArrayList();
- }
- return this.content;
- }
-
- }
-
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/Admin.java b/importer/src/main/java/org/mskcc/cbio/importer/Admin.java
deleted file mode 100644
index ffbff78f3d9..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/Admin.java
+++ /dev/null
@@ -1,963 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-// imports
-import org.mskcc.cbio.importer.*;
-import org.mskcc.cbio.importer.model.*;
-import org.mskcc.cbio.portal.dao.DaoCancerStudy;
-
-import org.apache.commons.cli.*;
-
-import org.apache.commons.logging.*;
-import org.apache.log4j.PropertyConfigurator;
-
-import org.springframework.context.ApplicationContext;
-import org.springframework.context.support.ClassPathXmlApplicationContext;
-import org.springframework.mail.javamail.JavaMailSender;
-import org.springframework.mail.SimpleMailMessage;
-
-import java.io.*;
-import java.util.*;
-import java.text.SimpleDateFormat;
-
-/**
- * Class which provides command line admin capabilities
- * to the importer tool.
- */
-public class Admin implements Runnable {
-
- // our context file
- public static final String contextFile = "classpath:applicationContext-importer.xml";
-
- // date format
- public static final SimpleDateFormat PORTAL_DATE_FORMAT = new SimpleDateFormat("MM/dd/yyyy");
-
- // context
- private static final ApplicationContext context = new ClassPathXmlApplicationContext(contextFile);
-
- // our logger
- private static final Log LOG = LogFactory.getLog(Admin.class);
-
- // options var
- private static final Options options = initializeOptions();
-
- // identifiers for init db command
- private static final String PORTAL_DATABASE = "portal";
- private static final String IMPORTER_DATABASE = "importer";
-
- private int numStudiesUpdated;
-
- // parsed command line
- private CommandLine commandLine;
-
- /**
- * Method to get beans by id
- *
- * @param String beanID
- * @return Object
- */
- private static Object getBean(String beanID) {
- return context.getBean(beanID);
- }
-
- /**
- * Method to initialize our static options var
- *
- * @return Options
- */
- private static Options initializeOptions() {
-
- // create each option
- Option help = new Option("help", "Print this message.");
-
- Option initializeDatabase = (OptionBuilder.withArgName("db_name")
- .hasArg()
- .withDescription("Initialize database(s). Valid " +
- "database identifiers are: " +
- "\"" + PORTAL_DATABASE + "\" and \"" +
- IMPORTER_DATABASE + "\" or " +
- "\"" + Config.ALL + "\".")
- .create("init_db"));
-
- Option fetchData = (OptionBuilder.withArgName("data_source:run_date:update_worksheet")
- .hasArgs(3)
- .withValueSeparator(':')
- .withDescription("Fetch data from the given data_source and the given run date (mm/dd/yyyy). " +
- "Use \"" + Fetcher.LATEST_RUN_INDICATOR + "\" to retrieve the most current run or " +
- "when fetching clinical data. If fetching is from mercurial via automation, " +
- "if update_worksheet is 't', cancer study entries on the cancer_studies worksheet will be updated " +
- "or added as needed.")
- .create("fetch_data"));
-
- Option fetchReferenceData = (OptionBuilder.withArgName("reference_data")
- .hasArg()
- .withDescription("Fetch the given reference data." +
- " Use \"" + Config.ALL + "\" to retrieve all reference data.")
- .create("fetch_reference_data"));
-
- Option oncotateMAF = (OptionBuilder.withArgName("maf_file")
- .hasArg()
- .withDescription("Run the given MAF though the Oncotator and OMA tools.")
- .create("oncotate_maf"));
-
- Option oncotateAllMAFs = (OptionBuilder.withArgName("data_source")
- .hasArg()
- .withDescription("Run all MAFs in the given datasource though the Oncotator and OMA tools.")
- .create("oncotate_mafs"));
-
- Option convertData = (OptionBuilder.withArgName("portal:run_date:apply_overrides")
- .hasArgs(3)
- .withValueSeparator(':')
- .withDescription("Convert data within the importer database " +
- "from the given run date (mm/dd/yyyy), " +
- "for the given portal. If apply_overrides is 't', " +
- "overrides will be substituted for data_source data " +
- "before staging files are created.")
- .create("convert_data"));
-
- Option applyOverrides = (OptionBuilder.withArgName("portal:exclude_datatype:apply_case_lists")
- .hasArgs(3)
- .withValueSeparator(':')
- .withDescription("Replace staging files for the given portal " +
- "with any exisiting overrides. If exclude_datatype is set, " +
- "the datatype provided will not have overrides applied. If " +
- "apply_case_lists is 'f', case lists will not be copied into staging directory.")
- .create("apply_overrides"));
-
- Option generateCaseLists = (OptionBuilder.withArgName("portal")
- .hasArg()
- .withDescription("Generate case lists for existing " +
- "staging files for the given portal.")
- .create("generate_case_lists"));
-
- Option importReferenceData = (OptionBuilder.withArgName("reference_type")
- .hasArg()
- .withDescription("Import reference data for the given reference_type. "+
- "Use \"" + Config.ALL + "\" to import all reference data.")
- .create("import_reference_data"));
-
- Option importTypesOfCancer = (OptionBuilder.hasArg(false)
- .withDescription("Import types of cancer.")
- .create("import_types_of_cancer"));
-
- Option importData = (OptionBuilder.withArgName("portal:init_portal_db:init_tumor_types:ref_data")
- .hasArgs(4)
- .withValueSeparator(':')
- .withDescription("Import data for the given portal. " +
- "If init_portal_db is 't' a portal db will be created (an existing one will be clobbered. " +
- "If init_tumor_types is 't' tumor types will be imported " +
- "If ref_data is 't', all reference data will be imported prior to importing staging files.")
- .create("import_data"));
-
- Option updateStudyData = (OptionBuilder.withArgName("portal:update_worksheet:send_notification")
- .hasArgs(3)
- .withValueSeparator(':')
- .withDescription("Updates study data for the given portal. If update_worksheet is 't' " +
- "msk_automation_portal entry will be cleared. If send_notification is 't' " +
- "email will be sent to registered users with information about the updates.")
- .create("update_study_data"));
-
- Option importCaseLists = (OptionBuilder.withArgName("portal")
- .hasArgs(1)
- .withDescription("Import case lists for the given portal.")
- .create("import_case_lists"));
-
- Option copySegFiles = (OptionBuilder.withArgName("portal:seg_datatype:remote_user_name")
- .hasArgs(3)
- .withValueSeparator(':')
- .withDescription("Copies the given portal's .seg files to the location used for linking to IGV " +
- "from cBio Portal web site. 'ssh-add' should be executed prior to this " +
- "command to add your identity to the authentication agent.")
- .create("copy_seg_files"));
-
- Option redeployWar = (OptionBuilder.withArgName("portal")
- .hasArg()
- .withDescription("Redeploy war for given portal. " +
- "'ssh-add' should be executed prior to this " +
- "command to add your identity to the authentication agent.")
- .create("redeploy_war"));
-
- Option deleteCancerStudy = (OptionBuilder.withArgName("cancer_study_id")
- .hasArg()
- .withDescription("Delete a cancer study matching the given cancer study id.")
- .create("delete_cancer_study"));
-
- // create an options instance
- Options toReturn = new Options();
-
- // add options
- toReturn.addOption(help);
- toReturn.addOption(initializeDatabase);
- toReturn.addOption(fetchData);
- toReturn.addOption(fetchReferenceData);
- toReturn.addOption(oncotateMAF);
- toReturn.addOption(oncotateAllMAFs);
- toReturn.addOption(convertData);
- toReturn.addOption(applyOverrides);
- toReturn.addOption(generateCaseLists);
- toReturn.addOption(importReferenceData);
- toReturn.addOption(importTypesOfCancer);
- toReturn.addOption(importData);
- toReturn.addOption(updateStudyData);
- toReturn.addOption(importCaseLists);
- toReturn.addOption(copySegFiles);
- toReturn.addOption(redeployWar);
- toReturn.addOption(deleteCancerStudy);
-
- // outta here
- return toReturn;
- }
-
- /**
- * Parses the arguments.
- *
- * @param args String[]
- */
- public void setCommandParameters(String[] args) {
-
- // create our parser
- CommandLineParser parser = new PosixParser();
-
- // parse
- try {
- commandLine = parser.parse(options, args);
- }
- catch (Exception e) {
- Admin.usage(new PrintWriter(System.out, true));
- }
- }
-
- public int getNumStudiesUpdated()
- {
- return numStudiesUpdated;
- }
-
- /*
- * Executes the desired portal command.
- */
- @Override
- public void run() {
-
- numStudiesUpdated = 0;
-
- // sanity check
- if (commandLine == null) {
- return;
- }
-
- try {
- // usage
- if (commandLine.hasOption("help")) {
- Admin.usage(new PrintWriter(System.out, true));
- }
- // initialize import database
- else if (commandLine.hasOption("init_db")) {
- initializeDatabase(commandLine.getOptionValue("init_db"));
- }
- // fetch
- else if (commandLine.hasOption("fetch_data")) {
- String[] values = commandLine.getOptionValues("fetch_data");
- fetchData(values[0], values[1], (values.length == 3) ? values[2] : "f");
- }
- // fetch reference data
- else if (commandLine.hasOption("fetch_reference_data")) {
- fetchReferenceData(commandLine.getOptionValue("fetch_reference_data"));
- }
- // oncotate MAF
- else if (commandLine.hasOption("oncotate_maf")) {
- oncotateMAF(commandLine.getOptionValue("oncotate_maf"));
- }
- // oncotate MAFs
- else if (commandLine.hasOption("oncotate_mafs")) {
- oncotateAllMAFs(commandLine.getOptionValue("oncotate_mafs"));
- }
- // apply overrides
- else if (commandLine.hasOption("apply_overrides")) {
- String[] values = commandLine.getOptionValues("apply_overrides");
- applyOverrides(values[0], (values.length >= 2) ? values[1] : "", (values.length == 3) ? values[2] : "");
- }
- // convert data
- else if (commandLine.hasOption("convert_data")) {
- String[] values = commandLine.getOptionValues("convert_data");
- convertData(values[0], values[1], (values.length == 3) ? values[2] : "");
- }
- // generate case lists
- else if (commandLine.hasOption("generate_case_lists")) {
- generateCaseLists(commandLine.getOptionValue("generate_case_lists"));
- }
- // import reference data
- else if (commandLine.hasOption("import_reference_data")) {
- importReferenceData(commandLine.getOptionValue("import_reference_data"));
- }
- else if (commandLine.hasOption("import_types_of_cancer")) {
- importTypesOfCancer();
- }
- // import data
- else if (commandLine.hasOption("import_data")) {
- String[] values = commandLine.getOptionValues("import_data");
- importData(values[0], values[1], values[2], values[3]);
- }
- else if (commandLine.hasOption("update_study_data")) {
- String[] values = commandLine.getOptionValues("update_study_data");
- numStudiesUpdated = updateStudyData(values[0], values[1], values[2]);
- }
-
- // import case lists
- else if (commandLine.hasOption("import_case_lists")) {
- String[] values = commandLine.getOptionValues("import_case_lists");
- importCaseLists(values[0]);
- }
- // copy seg files
- else if (commandLine.hasOption("copy_seg_files")) {
- String[] values = commandLine.getOptionValues("copy_seg_files");
- copySegFiles(values[0], values[1], values[2]);
- }
- // redeploy war
- else if (commandLine.hasOption("redeploy_war")) {
- redeployWar(commandLine.getOptionValue("redeploy_war"));
- }
- else if (commandLine.hasOption("delete_cancer_study")) {
- deleteCancerStudy(commandLine.getOptionValue("delete_cancer_study"));
- }
- else {
- Admin.usage(new PrintWriter(System.out, true));
- }
- }
- catch (Exception e) {
- e.printStackTrace();
- }
- }
-
- /**
- * Helper function to initialize import database.
- *
- * @param databaseName String
- * @throws Exception
- */
- private void initializeDatabase(String databaseName) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("initializeDatabase(): " + databaseName);
- }
-
- boolean unknownDB = true;
- DatabaseUtils databaseUtils = (DatabaseUtils)getBean("databaseUtils");
- if (databaseName.equals(Config.ALL) || databaseName.equals(IMPORTER_DATABASE)) {
- unknownDB = false;
- databaseUtils.createDatabase(databaseUtils.getImporterDatabaseName(), true);
- }
- if (databaseName.equals(Config.ALL) || databaseName.equals(PORTAL_DATABASE)) {
- unknownDB = false;
- databaseUtils.createDatabase(databaseUtils.getPortalDatabaseName(), false);
- boolean success = databaseUtils.executeScript(databaseUtils.getPortalDatabaseName(),
- databaseUtils.getPortalDatabaseSchema(),
- databaseUtils.getDatabaseUser(),
- databaseUtils.getDatabasePassword());
- if (!success) {
- System.err.println("Error creating database schema.");
- }
- }
- if (unknownDB && LOG.isInfoEnabled()) {
- LOG.info("initializeDatabase(), unknown database: " + databaseName);
- }
-
- if (LOG.isInfoEnabled()) {
- LOG.info("initializeDatabase(), complete");
- }
- }
-
- /**
- * Helper function to get data.
- *
- * @param dataSource String
- * @param runDate String
- * @throws Exception
- */
- private void fetchData(String dataSource, String runDate, String updateWorksheet) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("fetchData(), dataSource:runDate: " + dataSource + ":" + runDate);
- }
- Boolean updateWorksheetBool = getBoolean(updateWorksheet);
-
- // create an instance of fetcher
- DataSourcesMetadata dataSourcesMetadata = getDataSourcesMetadata(dataSource);
- // fetch the given data source
- Fetcher fetcher = (Fetcher)getBean(dataSourcesMetadata.getFetcherBeanID());
- fetcher.fetch(dataSource, runDate, updateWorksheetBool);
-
- if (LOG.isInfoEnabled()) {
- LOG.info("fetchData(), complete");
- }
- }
-
- /**
- * Helper function to fetch reference data.
- *
- * @param referenceType String
- *
- * @throws Exception
- */
- private void fetchReferenceData(String referenceType) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("fetchReferenceData(), referenceType: " + referenceType);
- }
-
- // create an instance of fetcher
- Config config = (Config)getBean("config");
- Collection referenceMetadatas = config.getReferenceMetadata(referenceType);
- if (referenceMetadatas.isEmpty()) {
- if (LOG.isInfoEnabled()) {
- LOG.info("fetchReferenceData(), unknown referenceType: " + referenceType);
- }
- }
- else {
- Fetcher fetcher = (Fetcher)getBean("referenceDataFetcher");
- for (ReferenceMetadata referenceMetadata : referenceMetadatas) {
- if ((referenceType.equals(Config.ALL) && referenceMetadata.getFetch())
- || referenceMetadata.getReferenceType().equals(referenceType)) {
- if (LOG.isInfoEnabled()) {
- LOG.info("fetchReferenceData(), calling fetcher for: " + referenceMetadata.getReferenceType());
- }
- fetcher.fetchReferenceData(referenceMetadata);
- }
- }
- }
- if (LOG.isInfoEnabled()) {
- LOG.info("fetchReferenceData(), complete");
- }
- }
-
- /**
- * Helper function to oncotate the given MAF.
- *
- * @param mafFileName String
- *
- * @throws Exception
- */
- private void oncotateMAF(String mafFileName) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("oncotateMAF(), mafFile: " + mafFileName);
- }
-
- // sanity check
- File mafFile = new File(mafFileName);
- if (!mafFile.exists()) {
- throw new IllegalArgumentException("cannot find the given MAF: " + mafFileName);
- }
-
- // create fileUtils object
- Config config = (Config)getBean("config");
- FileUtils fileUtils = (FileUtils)getBean("fileUtils");
-
- // create tmp file for given MAF
- File tmpMAF =
- org.apache.commons.io.FileUtils.getFile(org.apache.commons.io.FileUtils.getTempDirectory(),
- ""+System.currentTimeMillis()+".tmpMAF");
- org.apache.commons.io.FileUtils.copyFile(mafFile, tmpMAF);
-
- // oncotate the MAF (input is tmp maf, output is original maf)
- fileUtils.oncotateMAF(FileUtils.FILE_URL_PREFIX + tmpMAF.getCanonicalPath(),
- FileUtils.FILE_URL_PREFIX + mafFile.getCanonicalPath());
-
- // clean up
- if (tmpMAF.exists()) {
- org.apache.commons.io.FileUtils.forceDelete(tmpMAF);
- }
-
- if (LOG.isInfoEnabled()) {
- LOG.info("oncotateMAF(), complete");
- }
- }
-
- /**
- * Helper function to oncotate MAFs.
- *
- * @param dataSource String
- *
- * @throws Exception
- */
- private void oncotateAllMAFs(String dataSource) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("oncotateAllMAFs(), dataSource: " + dataSource);
- }
-
- // get the data source metadata object
- DataSourcesMetadata dataSourcesMetadata = getDataSourcesMetadata(dataSource);
-
- // oncotate all the files of the given data source
- FileUtils fileUtils = (FileUtils)getBean("fileUtils");
- fileUtils.oncotateAllMAFs(dataSourcesMetadata);
-
- if (LOG.isInfoEnabled()) {
- LOG.info("oncotateAllMAFs(), complete");
- }
- }
-
- /**
- * Helper function to convert data.
- *
- * @param portal String
- * @param runDate String
- * @param applyOverrides String
- *
- * @throws Exception
- */
- private void convertData(String portal, String runDate, String applyOverrides) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("convertData(), portal: " + portal);
- LOG.info("convertData(), run date: " + runDate);
- LOG.info("convertData(), apply overrides: " + applyOverrides);
- }
-
- Boolean applyOverridesBool = getBoolean(applyOverrides);
-
- // sanity check date format - doesn't work?
- PORTAL_DATE_FORMAT.setLenient(false);
- PORTAL_DATE_FORMAT.parse(runDate);
-
- // create an instance of Converter
- Converter converter = (Converter)getBean("converter");
- converter.convertData(portal, runDate, applyOverridesBool);
-
- if (LOG.isInfoEnabled()) {
- LOG.info("convertData(), complete");
- }
- }
-
- /**
- * Helper function to apply overrides to a given portal.
- *
- * @param portal String
- * @param excludeDatatype String
- * @param applyCaseLists String
- * @throws Exception
- */
- private void applyOverrides(String portal, String excludeDatatype, String applyCaseLists) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("applyOverrides(), portal: " + portal);
- LOG.info("applyOverrides(), exclude_datatype: " + excludeDatatype);
- LOG.info("applyOverrides(), apply_case_lists: " + applyCaseLists);
- }
-
- Converter converter = (Converter)getBean("converter");
- HashSet excludeDatatypes = new HashSet();
- if (excludeDatatype.length() > 0) excludeDatatypes.add(excludeDatatype);
- Boolean applyCaseListsBool = getBoolean(applyCaseLists);
- converter.applyOverrides(portal, excludeDatatypes, applyCaseListsBool);
-
- if (LOG.isInfoEnabled()) {
- LOG.info("applyOverrides(), complete");
- }
- }
-
- /**
- * Helper function to generate case lists.
- *
- * @param portal String
- *
- * @throws Exception
- */
- private void generateCaseLists(String portal) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("generateCaseLists(), portal: " + portal);
- }
-
- // create an instance of Converter
- Converter converter = (Converter)getBean("converter");
- converter.generateCaseLists(portal);
-
- if (LOG.isInfoEnabled()) {
- LOG.info("generateCaseLists(), complete");
- }
- }
-
- /**
- * Helper function to import reference data.
- *
- * @param referenceType String
- *
- * @throws Exception
- */
- private void importReferenceData(String referenceType) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("importReferenceData(), referenceType: " + referenceType);
- }
-
- // create an instance of Importer
- Config config = (Config)getBean("config");
- Collection referenceMetadatas = config.getReferenceMetadata(referenceType);
- if (referenceMetadatas.isEmpty()) {
- if (LOG.isInfoEnabled()) {
- LOG.info("importReferenceData(), unknown referenceType: " + referenceType);
- }
- }
- else {
- Importer importer = (Importer)getBean("importer");
- for (ReferenceMetadata referenceMetadata : referenceMetadatas) {
- if ((referenceType.equals(Config.ALL) && referenceMetadata.getImport()) ||
- referenceMetadata.getReferenceType().equals(referenceType)) {
- if (LOG.isInfoEnabled()) {
- LOG.info("importReferenceData(), calling import for: " + referenceMetadata.getReferenceType());
- }
- importer.importReferenceData(referenceMetadata);
- }
- }
- }
- if (LOG.isInfoEnabled()) {
- LOG.info("importReferenceData(), complete");
- }
- }
-
- /**
- * Helper function to import types of cancer.
- *
- * Takes no parameters.
- *
- * @throws Exception
- */
- private void importTypesOfCancer() throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("importTypesOfCancer()");
- }
-
- Importer importer = (Importer)getBean("importer");
- importer.importTypesOfCancer();
-
- if (LOG.isInfoEnabled()) {
- LOG.info("importTypesOfCancer(), complete");
- }
- }
-
- /**
- * Helper function to import data.
- *
- * @param portal String
- * @param initPortalDatabase String
- * @param initTumorTypes String
- * @param importReferenceData String
- *
- * @throws Exception
- */
- private void importData(String portal, String initPortalDatabase, String initTumorTypes, String importReferenceData) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("importData(), portal: " + portal);
- LOG.info("importData(), initPortalDatabase: " + initPortalDatabase);
- LOG.info("importData(), initTumorTypes: " + initTumorTypes);
- LOG.info("importData(), importReferenceData: " + importReferenceData);
- }
-
- // get booleans
- Boolean initPortalDatabaseBool = getBoolean(initPortalDatabase);
- Boolean initTumorTypesBool = getBoolean(initTumorTypes);
- Boolean importReferenceDataBool = getBoolean(importReferenceData);
-
- // create an instance of Importer
- Importer importer = (Importer)getBean("importer");
- importer.importData(portal, initPortalDatabaseBool, initTumorTypesBool, importReferenceDataBool);
-
- if (LOG.isInfoEnabled()) {
- LOG.info("importData(), complete");
- }
- }
-
- private int updateStudyData(String portal, String updateWorksheet, String sendNotification) throws Exception
- {
- if (LOG.isInfoEnabled()) {
- LOG.info("updateStudyData(), portal: " + portal);
- LOG.info("updateStudyData(), update_worksheet: " + updateWorksheet);
- }
- Boolean updateWorksheetBool = getBoolean(updateWorksheet);
- Boolean sendNotificationBool = getBoolean(sendNotification);
-
- Config config = (Config)getBean("config");
- Importer importer = (Importer)getBean("importer");
-
- Map propertyMap = new HashMap();
- propertyMap.put(CancerStudyMetadata.MSK_PORTAL_COLUMN_KEY, "");
-
- List cancerStudiesUpdated = new ArrayList();
- List cancerStudiesRemoved = new ArrayList();
-
- Collection cancerStudyMetadataToImport = config.getCancerStudyMetadata(portal);
- for (CancerStudyMetadata cancerStudyMetadata : config.getAllCancerStudyMetadata()) {
- if (portal.equals(PortalMetadata.TRIAGE_PORTAL)) {
- if (cancerStudyMetadataToImport.contains(cancerStudyMetadata)) {
- if (!DaoCancerStudy.doesCancerStudyExistByStableId(cancerStudyMetadata.getStableId())) {
- // update/add study into db
- try {
- importer.updateCancerStudy(portal, cancerStudyMetadata);
- cancerStudiesUpdated.add(cancerStudyMetadata.getStudyPath());
- }
- catch (Exception e) {
- LOG.info(e.getMessage());
- LOG.info("Error updating study: " + cancerStudyMetadata.getStableId() + ", skipping.");
- }
- }
- }
- else {
- // remove from db
- if (deleteCancerStudy(cancerStudyMetadata.getStableId())) {
- cancerStudiesRemoved.add(cancerStudyMetadata.getStudyPath());
- }
- }
- }
- else if (cancerStudyMetadataToImport.contains(cancerStudyMetadata)) {
- importer.updateCancerStudy(portal, cancerStudyMetadata);
- cancerStudiesUpdated.add(cancerStudyMetadata.getStudyPath());
- }
- if (portal.equals(PortalMetadata.MSK_AUTOMATION_PORTAL) && updateWorksheetBool
- && cancerStudyMetadataToImport.contains(cancerStudyMetadata)) {
- // For BIC, we do not want to update production again unless a new update occurs.
- // For DMP we will, so we need option to clear msk_automation_portal flag
- config.updateCancerStudyAttributes(cancerStudyMetadata.getStudyPath(), propertyMap);
- }
- }
- if (sendNotificationBool && (!cancerStudiesUpdated.isEmpty() || !cancerStudiesRemoved.isEmpty())) {
- sendNotification(portal, cancerStudiesUpdated, cancerStudiesRemoved);
- }
-
- return cancerStudiesUpdated.size() + cancerStudiesRemoved.size();
- }
-
-
-
- private void sendNotification(String portal, List cancerStudiesUpdated, List cancerStudiesRemoved)
- {
- Config config = (Config)getBean("config");
- SimpleMailMessage message = null;
- if (portal.equals(CancerStudyMetadata.MSK_PORTAL_COLUMN_KEY)) {
- message = (SimpleMailMessage)getBean("mskUpdateMessage");
- }
- else if (portal.equals(CancerStudyMetadata.TRIAGE_PORTAL_COLUMN_KEY)) {
- message = (SimpleMailMessage)getBean("triageUpdateMessage");
- }
- String body = message.getText() + "\n\n";
- SimpleMailMessage msg = new SimpleMailMessage(message);
- for (String cancerStudy : cancerStudiesUpdated) {
- CancerStudyMetadata cancerStudyMetadata = config.getCancerStudyMetadataByName(cancerStudy);
- body += cancerStudyMetadata.getStableId() + "\n";
- }
- if (!cancerStudiesRemoved.isEmpty()) {
- body += "\n\n" + "The following studies have been removed:\n\n";
- for (String cancerStudy : cancerStudiesRemoved) {
- CancerStudyMetadata cancerStudyMetadata = config.getCancerStudyMetadataByName(cancerStudy);
- body += cancerStudyMetadata.getStableId() + "\n";
- }
- }
- msg.setText(body);
- try {
- JavaMailSender mailSender = (JavaMailSender)getBean("mailSender");
- mailSender.send(msg);
- }
- catch (Exception e) {
- LOG.info("sendNotification(), error sending email notification:\n" + e.getMessage());
- }
- }
-
- /**
- *
- * @param portal
- * @throws Exception
- */
- private void importCaseLists(String portal) throws Exception {
- if (LOG.isInfoEnabled()) {
- LOG.info("importCaseLists(), portal: " + portal);
- }
-
- // create an instance of Importer
- Importer importer = (Importer)getBean("importer");
- importer.importCaseLists(portal);
-
- if (LOG.isInfoEnabled()) {
- LOG.info("importCaseLists(), complete");
- }
- }
-
- /**
- * Helper function to copy seg files for IGV linking.
- *
- * @param portalName String
- * @param segDatatype String
- * @param remoteUserName String
- *
- * @throws Exception
- */
- private void copySegFiles(String portalName, String segDatatype, String remoteUserName) throws Exception {
-
- if (LOG.isInfoEnabled()) {
- LOG.info("copySegFiles(), portal: " + portalName);
- LOG.info("copySegFiles(), segDatatype: " + segDatatype);
- LOG.info("copySegFiles(), remoteUserName: " + remoteUserName);
- }
-
- Config config = (Config)getBean("config");
- Collection portalMetadatas = config.getPortalMetadata(portalName);
- Collection datatypeMetadatas = config.getDatatypeMetadata(segDatatype);
-
- // sanity check args
- if (remoteUserName.length() == 0 || portalMetadatas.isEmpty() || datatypeMetadatas.isEmpty()) {
- if (LOG.isInfoEnabled()) {
- LOG.info("copySegFiles(), error processing arguments, aborting....");
- }
- }
- else {
- // get the FileUtils bean
- FileUtils fileUtils = (FileUtils)getBean("fileUtils");
- fileUtils.copySegFiles(portalMetadatas.iterator().next(),
- datatypeMetadatas.iterator().next(),
- remoteUserName);
- }
-
- if (LOG.isInfoEnabled()) {
- LOG.info("copySegFiles(), complete");
- }
- }
-
- private void redeployWar(String portalName) throws Exception
- {
- if (LOG.isInfoEnabled()) {
- LOG.info("redeployWar(), portal: " + portalName);
- }
-
- Config config = (Config)getBean("config");
- Collection portalMetadatas = config.getPortalMetadata(portalName);
-
- // sanity check args
- if (portalMetadatas.isEmpty()) {
- if (LOG.isInfoEnabled()) {
- LOG.info("redeployWar(), error processing argument, aborting....");
- }
- }
- else {
- // get the FileUtils bean
- FileUtils fileUtils = (FileUtils)getBean("fileUtils");
- fileUtils.redeployWar(portalMetadatas.iterator().next());
- }
-
- if (LOG.isInfoEnabled()) {
- LOG.info("redeployWar(), complete");
- }
- }
-
- private boolean deleteCancerStudy(String cancerStudyStableId) throws Exception
- {
- if (LOG.isInfoEnabled()) {
- LOG.info("deleteCancerStudy(), study id: " + cancerStudyStableId);
- }
- if (DaoCancerStudy.doesCancerStudyExistByStableId(cancerStudyStableId)) {
- DaoCancerStudy.deleteCancerStudy(cancerStudyStableId);
- if (LOG.isInfoEnabled()) {
- LOG.info("deleteCancerStudy(), complete");
- }
- return true;
- }
- return false;
- }
-
- /**
- * Helper function to get a DataSourcesMetadata from
- * a given datasource (name).
- *
- * @param dataSource String
- * @return DataSourcesMetadata
- */
- private DataSourcesMetadata getDataSourcesMetadata(String dataSource) {
-
- DataSourcesMetadata toReturn = null;
- Config config = (Config)getBean("config");
- Collection dataSources = config.getDataSourcesMetadata(dataSource);
- if (!dataSources.isEmpty()) {
- toReturn = dataSources.iterator().next();
- }
-
- // sanity check
- if (toReturn == null) {
- throw new IllegalArgumentException("cannot instantiate a proper DataSourcesMetadata object.");
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Helper function to create boolean based on argument parameter.
- *
- * @param parameterValue String
- * @return Boolean
- */
- private Boolean getBoolean(String parameterValue) {
- if (parameterValue.length() == 0) return Boolean.FALSE;
- return Boolean.valueOf(parameterValue.equalsIgnoreCase("t"));
- }
-
- /**
- * Helper function - prints usage
- */
- public static void usage(PrintWriter writer) {
-
- HelpFormatter formatter = new HelpFormatter();
- formatter.printHelp(writer, HelpFormatter.DEFAULT_WIDTH,
- "Admin", "", options,
- HelpFormatter.DEFAULT_LEFT_PAD,
- HelpFormatter.DEFAULT_DESC_PAD, "");
- }
-
- /**
- * The big deal main.
- *
- * @param args String[]
- */
- public static void main(String[] args) throws Exception {
-
- // sanity check
- if (args.length == 0) {
- System.err.println("Missing args to Admin.");
- Admin.usage(new PrintWriter(System.err, true));
- return;
- }
-
- // configure logging
- Properties props = new Properties();
- props.load(Admin.class.getResourceAsStream("/log4j.properties"));
- PropertyConfigurator.configure(props);
-
- // process
- Admin admin = new Admin();
- try {
- admin.setCommandParameters(args);
- admin.run();
- }
- catch (Exception e) {
- e.printStackTrace();
- }
-
- System.exit(admin.getNumStudiesUpdated());
- }
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/CaseIDs.java b/importer/src/main/java/org/mskcc/cbio/importer/CaseIDs.java
deleted file mode 100644
index 60bd9e2619b..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/CaseIDs.java
+++ /dev/null
@@ -1,38 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-// imports
-import org.mskcc.cbio.importer.model.DataMatrix;
-
-import java.util.Collection;
-
-/**
- * Interface used to manage case ids within import data matrices.
- */
-public interface CaseIDs {
- boolean isSampleId(String caseId);
- boolean isSampleId(int cancerStudyId, String caseId);
- boolean isNormalId(String caseId);
- boolean isTruncatedTCGAPatientId(String caseId);
- String getSampleId(String caseId);
- String getSampleId(int cancerStudyId, String caseId);
- String getPatientId(String caseId);
- String getPatientId(int cancerStudyId, String caseId);
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/Config.java b/importer/src/main/java/org/mskcc/cbio/importer/Config.java
deleted file mode 100644
index f970877b01a..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/Config.java
+++ /dev/null
@@ -1,213 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-// imports
-import java.util.Collection;
-import java.util.List;
-import java.util.Map;
-import org.mskcc.cbio.importer.model.*;
-
-/**
- * Interface used to get/set configuration properties.
- */
-public interface Config {
-
- // const used when requesting all of something
- public static final String ALL = "all";
-
- /**
- * Gets a TumorTypeMetadata object via tumorType.
- * If tumorType == Config.ALL, all are returned.
- *
- * @param tumorType String
- * @return Collection
- */
- Collection getTumorTypeMetadata(String tumorType);
- public TCGATumorTypeMetadata getTCGATumorTypeMetadata(String oncotreeCode);
-
- /**
- * Function to get tumor types to download as String[]
- *
- * @return String[]
- */
- String[] getTumorTypesToDownload();
-
- /**
- * Gets a DatatypeMetadata object for the given datatype name.
- * If datatype == Config.ALL, all are returned.
- *
- * @param datatype String
- * @return Collection
- */
- Collection getDatatypeMetadata(String datatype);
-
- /**
- * Gets a collection of Datatype names for the given portal/cancer study.
- *
- * @param portalMetadata PortalMetadata
- * @param cancerStudyMetadata CancerStudyMetadata
- * @return Collection
- */
- Collection getDatatypeMetadata(PortalMetadata portalMetadata, CancerStudyMetadata cancerStudyMetadata);
-
- /**
- * Function to get datatypes to download as String[].
- *
- * @param dataSourcesMetadata DataSourcesMetadata
- * @return String[]
- * @throws Exception
- */
- String[] getDatatypesToDownload(DataSourcesMetadata dataSourcesMetadata) throws Exception;
-
- /**
- * Function to determine the datatype(s)
- * of the datasource file (the file that was fetched from a datasource).
- *
- * @param dataSourcesMetadata DataSourcesMetadata
- * @param filename String
- * @return Collection
- * @throws Exception
- */
- Collection getFileDatatype(DataSourcesMetadata dataSourcesMetadata, String filename) throws Exception;
-
- /**
- * Gets a collection of CaseIDFilterMetadata.
- * If filterName == Config.ALL, all are returned.
- *
- * @param filterName String
- * @return Collection
- */
- Collection getCaseIDFilterMetadata(String filterName);
-
- /**
- * Gets a collection of CaseListMetadata.
- * If caseListFilename == Config.ALL, all are returned.
- *
- * @param caseListFilename String
- * @return Collection
- */
- Collection getCaseListMetadata(String caseListFilename);
-
- /**
- * Gets a collection of ClinicalAttributesNamespace.
- * If clinicalAttributesNamespaceColumnHeader == Config.ALL, all are returned.
- *
- * @param clinicalAttributesNamespaceColumnHeader String
- * @return Collection
- */
- Collection getClinicalAttributesNamespace(String clinicalAttributesNamespaceColumnHeader);
-
- /**
- * Gets a collection of ClinicalAttributesMetadata.
- * If clinicalAttributeColumnHeader == Config.ALL, all are returned.
- *
- * @param clinicalAttributeColumnHeader String
- * @return Collection
- */
- Collection getClinicalAttributesMetadata(String clinicalAttributeColumnHeader);
-
- /**
- * Gets a map of ClinicalAttributesMetadata (external column header key, metadata object value) given
- * a collection of "external" column header values (column headers from incoming datafiles).
- *
- * @param externalColumnHeaders Collection
- * @return Map
- */
- Map getClinicalAttributesMetadata(Collection externalColumnHeaders);
-
- /**
- * Imports the given collection of bcrs if they are unknown.
- *
- * @param bcrs Collection
- */
- void importBCRClinicalAttributes(Collection bcrs);
-
- void flagMissingClinicalAttributes(String cancerStudy, String tumorType, Collection missingAttributeColumnHeaders);
-
- /**
- * Gets a PortalMetadata object given a portal name.
- * If portalName == Config.ALL, all are returned.
- *
- * @param portalName String
- * @return Collection
- */
- Collection getPortalMetadata(String portalName);
-
- /**
- * Gets ReferenceMetadata for the given referenceType.
- * If referenceType == Config.ALL, all are returned.
- *
- * @param referenceType String
- * @return Collection
- */
- Collection getReferenceMetadata(String referenceType);
-
- /**
- * Gets DataSourcesMetadata for the given dataSource. If dataSource == Config.ALL,
- * all are returned.
- *
- * @param dataSource String
- * @return Collection
- */
- Collection getDataSourcesMetadata(String dataSource);
-
- Collection getAllCancerStudyMetadata();
-
- /**
- * Gets all the cancer studies for a given portal.
- *
- * @param portalName String
- * @return Collection
- */
- Collection getCancerStudyMetadata(String portalName);
-
- /**
- * Gets a CancerStudyMetadata for the given cancer study.
- *
- * @param cancerStudyName String - fully qualified path as entered on worksheet, e.g.: prad/mskcc/foundation
- * @return CancerStudyMetadata or null if not found
- */
- CancerStudyMetadata getCancerStudyMetadataByName(String cancerStudyName);
-
- /**
- * Gets FoundationMetadata.
- *
- * @return Collection
- */
- Collection getFoundationMetadata();
-
- /**
- * Returns the collection of ICGC metadata objects.
- */
- Collection getIcgcMetadata();
-
- /**
- * Returns a list of cancer study names which incorporate the supplied
- * substring. Comparisons are standardized to
- * lower case. An empty list is returned if no matches are found.
- * @param substring String
- * @return List
- */
- List findCancerStudiesBySubstring(String substring);
-
-
- void updateCancerStudyAttributes(String cancerStudy, Map properties);
- void insertCancerStudyAttributes(Map properties);
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/Converter.java b/importer/src/main/java/org/mskcc/cbio/importer/Converter.java
deleted file mode 100644
index 437a1b31229..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/Converter.java
+++ /dev/null
@@ -1,84 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-// imports
-import java.util.Arrays;
-import java.util.HashSet;
-import org.mskcc.cbio.importer.model.DataMatrix;
-import org.mskcc.cbio.importer.model.PortalMetadata;
-import org.mskcc.cbio.importer.model.DatatypeMetadata;
-import org.mskcc.cbio.importer.model.CancerStudyMetadata;
-
-import java.util.Set;
-
-/**
- * Interface used to convert portal data.
- */
-public interface Converter {
-
- public static final String VALUE_DELIMITER = "\t";
- public static final String GENE_ID_COLUMN_HEADER_NAME = "Entrez_Gene_Id";
- public static final String GENE_SYMBOL_COLUMN_HEADER_NAME = "Hugo_Symbol";
- public static final String MUTATION_CASE_ID_COLUMN_HEADER = "Tumor_Sample_Barcode";
- public static final String MUTATION_CASE_LIST_META_HEADER = "sequenced_samples";
- public static final Set NON_CASE_IDS = new HashSet(
- Arrays.asList("MIRNA", "LOCUS", "ID", "GENE SYMBOL", "ENTREZ_GENE_ID", "HUGO_SYMBOL", "LOCUS ID", "CYTOBAND", "COMPOSITE.ELEMENT.REF", "HYBRIDIZATION REF"));
-
- /**
- * Converts data for the given portal.
- *
- * @param portal String
- * @param runDate String
- * @param applyOverrides Boolean
- * @throws Exception
- */
- void convertData(String portal, String runDate, Boolean applyOverrides) throws Exception;
-
- /**
- * Generates case lists for the given portal.
- *
- * @param portal String
- * @throws Exception
- */
- void generateCaseLists(String portal) throws Exception;
-
- /**
- * Applies overrides to the given portal using the given data source.
- * Any datatypes within the excluded datatypes set will not be overridden.
- *
- * @param portal String
- * @param excludeDatatypes Set
- * @param applyCaseLists boolean
- * @throws Exception
- */
- void applyOverrides(String portal, Set excludeDatatypes, boolean applyCaseLists) throws Exception;
-
- /**
- * Creates a staging file from the given import data.
- *
- * @param portalMetadata PortalMetadata
- * @param cancerStudyMetadata CancerStudyMetadata
- * @param datatypeMetadata DatatypeMetadata
- * @param dataMatrices DataMatrix[]
- * @throws Exception
- */
- void createStagingFile(PortalMetadata portalMetadata, CancerStudyMetadata cancerStudyMetadata,
- DatatypeMetadata datatypeMetadata, DataMatrix[] dataMatrices) throws Exception;
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/DatabaseUtils.java b/importer/src/main/java/org/mskcc/cbio/importer/DatabaseUtils.java
deleted file mode 100644
index 10a5f28a517..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/DatabaseUtils.java
+++ /dev/null
@@ -1,89 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-// imports
-import javax.sql.DataSource;
-
-/**
- * Interface used to create database/database schema dynamically.
- */
-public interface DatabaseUtils {
-
- /**
- * Returns the database user credential.
- *
- * @return String
- */
- String getDatabaseUser();
-
- /**
- * Returns the database password credential.
- *
- * @return String
- */
- String getDatabasePassword();
-
- /**
- * Returns the database connection string.
- *
- * @return String
- */
- String getDatabaseConnectionString();
-
- /**
- * Returns the database schema filename.
- *
- * @return String
- */
- String getPortalDatabaseSchema();
-
- /**
- * Returns the importer database name.
- *
- * @return String
- */
- String getImporterDatabaseName();
-
- /**
- * Returns the portal database name.
- *
- * @return String
- */
- String getPortalDatabaseName();
-
- /**
- * Creates a database and optional schema.
- *
- * @param databaseName String
- * @param createSchema boolean
- */
- void createDatabase(String databaseName, boolean createSchema);
-
- /**
- * Execute the given script on the given db.
- *
- * @param databaseName String
- * @param databaseScript String
- * @param databaseUser String
- * @param databasePassword String
- */
- boolean executeScript(String databaseName, String databaseScript,
- String databaseUser, String databasePassword);
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/Fetcher.java b/importer/src/main/java/org/mskcc/cbio/importer/Fetcher.java
deleted file mode 100644
index 8addab371ea..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/Fetcher.java
+++ /dev/null
@@ -1,49 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-// imports
-import org.mskcc.cbio.importer.model.ReferenceMetadata;
-
-/**
- * Interface used to retrieve portal data.
- */
-public interface Fetcher {
-
- // latest run indicator
- public static final String LATEST_RUN_INDICATOR = "latest";
-
- /**
- * Fetches genomic data from an external datasource and
- * places it in the database for processing.
- *
- * @param dataSource String
- * @param desiredRunDate String
- * @throws Exception
- */
- void fetch(String dataSource, String desiredRunDate, boolean updateStudiesWorksheet) throws Exception;
-
- /**
- * Fetches reference data from an external datasource.
- *
- * @param referenceMetadata ReferenceMetadata
- * @throws Exception
- */
- void fetchReferenceData(ReferenceMetadata referenceMetadata) throws Exception;
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/FileTransformer.java b/importer/src/main/java/org/mskcc/cbio/importer/FileTransformer.java
deleted file mode 100644
index b2dfcd57ae0..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/FileTransformer.java
+++ /dev/null
@@ -1,36 +0,0 @@
-/*
- * Copyright (c) 2014 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
- */
-
-package org.mskcc.cbio.importer;
-
-import java.io.IOException;
-import java.nio.file.Path;
-import org.mskcc.cbio.importer.foundation.extractor.FileDataSource;
-
-/**
- *
- * @author criscuof
- */
-public interface FileTransformer {
-
- public void transform(Path aPath) throws IOException;
- public void transform(FileDataSource fds);
- public String getPrimaryIdentifier();
- public Integer getPrimaryEntityCount();
-
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/FileUtils.java b/importer/src/main/java/org/mskcc/cbio/importer/FileUtils.java
deleted file mode 100644
index 00b763d3ae5..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/FileUtils.java
+++ /dev/null
@@ -1,334 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-// imports
-import org.mskcc.cbio.importer.CaseIDs;
-import org.mskcc.cbio.importer.model.*;
-
-import org.apache.commons.io.LineIterator;
-
-import java.io.*;
-import java.util.*;
-
-/**
- * Interface used to access some common file utils.
- */
-public interface FileUtils {
-
- public static final String FILE_URL_PREFIX = "file://";
- public static final String CASE_LIST_DIRECTORY_NAME = "case_lists";
-
- /**
- * Computes the MD5 digest for the given file.
- * Returns the 32 digit hexadecimal.
- *
- * @param file File
- * @return String
- * @throws Exception
- */
- String getMD5Digest(File file) throws Exception;
-
- /**
- * Reads the precomputed md5 digest out of a firehose .md5 file.
- *
- * @param file File
- * @return String
- * @throws Exception
- */
- String getPrecomputedMD5Digest(File file) throws Exception;
-
- /**
- * Makes a directory, including parent directories if necessary.
- *
- * @param directory File
- * @throws Exception
- */
- void makeDirectory(File directory) throws Exception;
-
- /**
- * Checks if directory is empty
- */
- boolean directoryIsEmpty(File directory) throws Exception;
-
- /**
- * Deletes a directory recursively.
- *
- * @param directory File
- * @throws Exception
- */
- void deleteDirectory(File directory) throws Exception;
-
- /**
- * Deletes a file.
- *
- * @param file File
- * @throws Exception
- */
- void deleteFile(File file) throws Exception;
-
- /**
- * Lists all files in a given directory and its subdirectories.
- *
- * @param directory File
- * @param extensions String[]
- * @param recursive boolean
- * @return Collection
- * @throws Exception
- */
- Collection listFiles(File directory, String[] extensions, boolean recursive) throws Exception;
- Collection listFiles(File directory, String wildcard) throws Exception;
-
- /**
- * Returns the contents of the datafile as specified by ImportDataRecord
- * in a DataMatrix. May return null if there is a problem reading the file.
- *
- * methylationCorrelation matrix is set when we are processing a methylation file.
- * These files can be extremely large, so the correlation file is used to skip
- * all rows in the methylation file that do not have a corresponding row in the correlation file.
- *
- * @param importDataRecord ImportDataRecord
- * @param methylationCorrelation DataMatrix
- * @return DataMatrix
- * @throws Exception
- */
- List getDataMatrices(ImportDataRecord importDataRecord, DataMatrix methylationCorrelation) throws Exception;
-
- /**
- * Returns a list of missing caselists. Applicable to
- * manually curated studies checked into a 'studies' directory
- *
- * @param rootDirectory String
- * @param cancerStudyMetadata CancerStudyMetadata
- * @return List
- */
- List getMissingCaseListFilenames(String rootDirectory, CancerStudyMetadata cancerStudyMetadata) throws Exception;
-
- /**
- * Generates caselists for the given cancer study. If strict is false, a check of isTumorCaseID is skipped.
- * If overwrite is set, any existing caselist file will be clobbered.
- *
- * @param overwrite boolean
- * @param strict boolean
- * @param stagingDirectory String
- * @param cancerStudyMetadata CancerStudyMetadata
- * @throws Exception
- */
- void generateCaseLists(boolean overwrite, boolean strict, String stagingDirectory, CancerStudyMetadata cancerStudyMetadata) throws Exception;
-
- /**
- * Get the case list from the staging file. If strict is false, a check of isTumorCaseID is skipped.
- *
- * @param strict boolean
- * @param caseIDs CaseIDs
- * @param cancerStudyMetadata CancerStudyMetadata
- * @param stagingDirectory String
- * @param stagingFilename String
- * @return List
- * @throws Exception
- */
- List getCaseListFromStagingFile(boolean strict, CaseIDs caseIDs, CancerStudyMetadata cancerStudyMetadata, String stagingDirectory, String stagingFilename) throws Exception;
-
- /**
- * Creates a temporary file with the given contents.
- *
- * @param filename String
- * @param fileContent String
- * @return File
- * @throws Exception
- */
- File createTmpFileWithContents(String filename, String fileContent) throws Exception;
-
- /**
- * Creates (or overwrites) the given file with the given contents. Filename
- * is canonical path/filename.
- *
- * @param filename String
- * @param fileContent String
- * @return File
- * @throws Exception
- */
- File createFileWithContents(String filename, String fileContent) throws Exception;
- File createFileFromStream(String filename, InputStream fileContent) throws Exception;
-
- /**
- * Downloads the given file specified via url to the given canonicalDestination.
- *
- * @param urlSource String
- * @param urlDestination String
- * @throws Exception
- */
- void downloadFile(String urlSource, String urlDestination) throws Exception;
-
- /**
- * Returns a line iterator over the given file.
- *
- * @param urlFile String
- * @throws Exception
- */
- LineIterator getFileContents(String urlFile) throws Exception;
-
- /**
- * Method which writes the cancer study metadata file.
- *
- * @param stagingDirectory String
- * @param cancerStudyMetadata CancerStudyMetadata
- * @param numCases int
- * @throws Exception
- */
- void writeCancerStudyMetadataFile(String stagingDirectory, CancerStudyMetadata cancerStudyMetadata, int numCases) throws Exception;
- void updateCancerStudyMetadataFile(String stagingDirectory, CancerStudyMetadata cancerStudyMetadata, Map properties) throws Exception;
-
- /**
- * Method which writes a metadata file for
- * the given Datatype metadata instance.
- *
- * @param stagingDirectory String
- * @param datatypeMetadata DatatypeMetadata
- * @param numCases int
- * @throws Exception
- */
- void writeMetadataFile(String stagingDirectory, CancerStudyMetadata cancerStudyMetadata, DatatypeMetadata datatypeMetadata, int numCases) throws Exception;
- void writeCopyNumberSegmentMetadataFile(String stagingDirectory, CancerStudyMetadata cancerStudyMetadata,
- DatatypeMetadata datatypeMetadata, DataMatrix dataMatrix) throws Exception;
-
- /**
- * Method which writes a metadata file for the
- * given DatatypeMetadata. DataMatrix may be null.
- *
- * @param stagingDirectory String
- * @param cancerStudyMetadata CancerStudyMetadata
- * @param datatypeMetadata DatatypeMetadata
- * @param dataMatrix DataMatrix
- * @throws Exception
- *
- */
- void writeMetadataFile(String stagingDirectory, CancerStudyMetadata cancerStudyMetadata,
- DatatypeMetadata datatypeMetadata, DataMatrix dataMatrix) throws Exception;
-
- /**
- * Creates a staging file (and meta file) with contents from the given DataMatrix.
- *
- * @param stagingDirectory String
- * @param cancerStudyMetadata CancerStudyMetadata
- * @param datatypeMetadata DatatypeMetadata
- * @param dataMatrix DataMatrix
- * @throws Exception
- */
- void writeStagingFile(String stagingDirectory, CancerStudyMetadata cancerStudyMetadata,
- DatatypeMetadata datatypeMetadata, DataMatrix dataMatrix) throws Exception;
-
- /**
- * Creates a staging file for mutation data (and meta file) with contents from the given DataMatrix.
- * This is called when the mutation file needs to be run through the Oncotator and Mutation Assessor Tools.
- *
- * @param stagingDirectory String
- * @param cancerStudyMetadata CancerStudyMetadata
- * @param datatypeMetadata DatatypeMetadata
- * @param dataMatrix DataMatrix
- * @throws Exception
- */
- void writeMutationStagingFile(String stagingDirectory, CancerStudyMetadata cancerStudyMetadata,
- DatatypeMetadata datatypeMetadata, DataMatrix dataMatrix) throws Exception;
-
- /**
- * Creates a z-score staging file from the given dependencies. It assumes that the
- * dependency - staging files have already been created.
- *
- * @param stagingDirectory String
- * @param cancerStudyMetadata CancerStudyMetadata
- * @param datatypeMetadata DatatypeMetadata
- * @param dependencies DatatypeMetadata[]
- * @throws Exception
- */
- boolean writeZScoresStagingFile(String stagingDirectory, CancerStudyMetadata cancerStudyMetadata,
- DatatypeMetadata datatypeMetadata, DatatypeMetadata[] dependencies) throws Exception;
-
- /**
- * Returns an override file (if it exists) for the given portal & cancer study. The override in this case
- * is the override file that a DataMatrix is created from.
- *
- * Null is returned if an override file is not found.
- *
- * @param portalMetadata PortalMetadata
- * @param cancerStudyMetadata CancerStudyMetadata
- * @param filename String
- * @return File
- * @throws Exception
- */
- File getOverrideFile(PortalMetadata portalMetadata, CancerStudyMetadata cancerStudyMetadata, String filename) throws Exception;
-
- /**
- * If it exists, moves an override file into the proper
- * location in the given portal's staging area.
- *
- * Note, filename can be the name of a file or directory (like case_lists)
- *
- * @param overrideDirectory String
- * @param stagingDirectory String
- * @param cancerStudyMetadata CancerStudyMetadata
- * @param overrideFilename String
- * @param stagingFilename String
- * @throws Exception
- */
- void applyOverride(String overrideDirectory, String stagingDirectory, CancerStudyMetadata cancerStudyMetadata,
- String overrideFilename, String stagingFilename) throws Exception;
-
- /**
- * Create a case list file from the given case list metadata file.
- *
- * @param stagingDirectory String
- * @param cancerStudyMetadata CancerStudyMetadata
- * @param caseListMetadata CaseListMetadata
- * @param caseList String[]
- * @throws Exception
- */
- void writeCaseListFile(String stagingDirectory, CancerStudyMetadata cancerStudyMetadata, CaseListMetadata caseListMetadata, String[] caseList) throws Exception;
-
- /**
- * Runs all MAFs for the given dataSourcesMetadata through
- * the Oncotator and OMA tools.
- *
- * @param dataSourcesMetadata DataSourcesMetadata
- * @throws Exception
- */
- void oncotateAllMAFs(DataSourcesMetadata dataSourcesMetadata) throws Exception;
-
- /**
- * Runs a MAF file through the Oncotator and OMA tools.
- *
- * @param inputMAFURL String
- * @param outputMAFURL String
- * @throws Exception
- */
- void oncotateMAF(String inputMAFURL, String outputMAFURL) throws Exception;
-
- /**
- * Copies the given portal's seg files to the location used for linking to IGV from the cBio Portal web site.
- *
- * @param portalMetadata PortalMetadata
- * @param datatypeMetadata DatatypeMetadata
- * @param remoteUserName String
- * @throws Exception
- */
- void copySegFiles(PortalMetadata portalMetadata, DatatypeMetadata datatypeMetadata, String remoteUserName) throws Exception;
- void redeployWar(PortalMetadata portalMetadata) throws Exception;
-
- CancerStudyMetadata createCancerStudyMetadataFromMetaStudyFile(String downloadDirectory, String studyName);
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/IDMapper.java b/importer/src/main/java/org/mskcc/cbio/importer/IDMapper.java
deleted file mode 100644
index 8c9215e7be9..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/IDMapper.java
+++ /dev/null
@@ -1,65 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-// imports
-
-import scala.Tuple2;
-
-/**
- * Interface used to map IDs.
- */
-public interface IDMapper {
-
- /**
- *
- * @param chromosome
- * @param position
- * @param strand
- * @return
- */
-
- public String findGeneNameByGenomicPosition(String chromosome, String position,String strand);
-
- /**
- * For the given symbol, return id.
- *
- * @param geneSymbol String
- * @return String
- * @throws Exception
- */
- String symbolToEntrezID(String geneSymbol) throws Exception;
-
- /**
- * For the entrezID, return symbol.
- *
- * @param entrezID String
- * @return String
- * @throws Exception
- */
- String entrezIDToSymbol(String entrezID) throws Exception;
-
- /**
- * returns the Gene Symbol and Entrez ID for a specified Ensembl ID
- * @param ensemblID
- * @return
- */
-
- Tuple2 ensemblToHugoSymbolAndEntrezID(String ensemblID);
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/Importer.java b/importer/src/main/java/org/mskcc/cbio/importer/Importer.java
deleted file mode 100644
index 506ac11830c..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/Importer.java
+++ /dev/null
@@ -1,67 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-// imports
-import org.mskcc.cbio.importer.model.*;
-
-/**
- * Interface used to import portal data.
- */
-public interface Importer {
-
- /**
- * Imports data for use in the given portal.
- *
- * @param portal String
- * @param initPortalDatabase Boolean
- * @param initTumorTypes Boolean
- * @param importReferenceData Boolean
- * @throws Exception
- */
- void importData(String portal, Boolean initPortalDatabase, Boolean initTumorTypes, Boolean importReferenceData) throws Exception;
- void updateCancerStudy(String portal, CancerStudyMetadata cancerStudyMetadata) throws Exception;
-
- /**
- * Imports the given reference data.
- *
- * @param referenceMetadata ReferenceMetadata
- * @throws Exception
- */
- void importReferenceData(ReferenceMetadata referenceMetadata) throws Exception;
-
- /**
- * Imports tumor type metadata.
- */
- void importTypesOfCancer() throws Exception;
-
- void importCaseLists(String portal) throws Exception;
-
- /**
- * Imports all cancer studies found within the given directory.
- * If force is set, user will not be prompted to override existing cancer study.
- * If cancer study exists and skip is set, new study will not be imported.
- *
- * @param cancerStudyDirectoryName
- * @param skip
- * @param force
-
- */
- void importCancerStudy(String cancerStudyDirectoryName, boolean skip, boolean force) throws Exception;
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/NCIcaDSRFetcher.java b/importer/src/main/java/org/mskcc/cbio/importer/NCIcaDSRFetcher.java
deleted file mode 100644
index 2ec4b6c8816..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/NCIcaDSRFetcher.java
+++ /dev/null
@@ -1,26 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-import org.mskcc.cbio.importer.model.NCIcaDSREntry;
-
-public interface NCIcaDSRFetcher
-{
- public NCIcaDSREntry fetchDSREntry(String cdiId);
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/PortalImporterTool.java b/importer/src/main/java/org/mskcc/cbio/importer/PortalImporterTool.java
deleted file mode 100644
index 9c6d53bccca..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/PortalImporterTool.java
+++ /dev/null
@@ -1,289 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-// imports
-import org.mskcc.cbio.portal.dao.DaoCancerStudy;
-import org.mskcc.cbio.portal.scripts.NormalizeExpressionLevels;
-
-import org.apache.commons.cli.*;
-
-import org.apache.commons.logging.*;
-import org.apache.log4j.PropertyConfigurator;
-
-import org.springframework.context.ApplicationContext;
-import org.springframework.context.support.ClassPathXmlApplicationContext;
-
-import java.io.*;
-import java.util.Properties;
-
-/**
- * Class which provides command line admin capabilities
- * to the importer tool.
- */
-public class PortalImporterTool implements Runnable {
-
- private static final String HOME_DIR = "PORTAL_HOME";
- private static final Log LOG = LogFactory.getLog(PortalImporterTool.class);
- static {
- configureLogging();
- }
- private static final String contextFile = "classpath:applicationContext-portalImporterTool.xml";
- private static final ApplicationContext context = new ClassPathXmlApplicationContext(contextFile);
- private static final Options options = initializeOptions();
-
- private CommandLine commandLine;
-
- private static Options initializeOptions()
- {
-
- // create each option
- Option help = new Option("h", "Print this message.");
-
- Option annotateMAF = (OptionBuilder.withArgName("maf:output")
- .hasArgs(2)
- .withValueSeparator(':')
- .withDescription("Annotates the MAF file with additional information from mutationassessor.org and Oncotator. " +
- "If output filename is not given, input filename will be used with a '.annotated' extension.")
- .create("a"));
-
- Option validateCancerStudy = (OptionBuilder.withArgName("dir")
- .hasArg()
- .withDescription("Validates cancer studies within the given cancer study directory.")
- .create("v"));
-
- Option normalizeDataFile = (OptionBuilder.withArgName("cna-file:expression-file:output-file:normal-sample-suffix")
- .hasArgs(4)
- .withValueSeparator(':')
- .withDescription("Given CNV & expression data for a set of samples, generate normalized expression values.")
- .create("n"));
-
- Option importCancerStudy = (OptionBuilder.withArgName("dir:skip:force")
- .hasArgs(3)
- .withValueSeparator(':')
- .withDescription("Import cancer study data into the database. " +
- "This command will traverse all subdirectories of cancer_study_directory " +
- "looking for cancer studies to import. If the skip argument is 't', " +
- "cancer studies will not be replaced. Set force to 't' to force a cancer study replacement.")
- .create("i"));
-
- Option deleteCancerStudy = (OptionBuilder.withArgName("cancer_study_id")
- .hasArg()
- .withDescription("Delete a cancer study matching the given cancer study id.")
- .create("d"));
-
-
- // create an options instance
- Options toReturn = new Options();
-
- // add options
- toReturn.addOption(help);
- toReturn.addOption(annotateMAF);
- toReturn.addOption(validateCancerStudy);
- toReturn.addOption(normalizeDataFile );
- toReturn.addOption(importCancerStudy);
- toReturn.addOption(deleteCancerStudy);
-
- // outta here
- return toReturn;
- }
-
- public void setCommandParameters(String[] args)
- {
- // create our parser
- CommandLineParser parser = new PosixParser();
-
- // parse
- try {
- commandLine = parser.parse(options, args);
- }
- catch (Exception e) {
- Admin.usage(new PrintWriter(System.out, true));
- }
- }
-
- public static void usage(PrintWriter writer)
- {
-
- HelpFormatter formatter = new HelpFormatter();
- formatter.printHelp(writer, 100,
- "PortalImporterTool", "", options,
- HelpFormatter.DEFAULT_LEFT_PAD,
- HelpFormatter.DEFAULT_DESC_PAD, "");
- }
-
- @Override
- public void run()
- {
- if (commandLine == null) return;
-
- try {
- if (commandLine.hasOption("h")) {
- Admin.usage(new PrintWriter(System.out, true));
- }
- else if (commandLine.hasOption("v")) {
- validateCancerStudy(commandLine.getOptionValue("v"));
- }
- else if (commandLine.hasOption("n")) {
- String[] values = commandLine.getOptionValues("n");
- normalizeExpressionLevels(values[0], values[1], values[2], values[3]);
- }
- else if (commandLine.hasOption("i")) {
- String[] values = commandLine.getOptionValues("i");
- importCancerStudy(values[0], (values.length >= 2) ? values[1] : "", (values.length == 3) ? values[2] : "");
- }
- else if (commandLine.hasOption("a")) {
- String[] values = commandLine.getOptionValues("a");
- annotateMAF(values[0], (values.length == 2) ? values[1] : values[0] + ".annotated");
- }
- else if (commandLine.hasOption("d")) {
- deleteCancerStudy(commandLine.getOptionValue("d"));
- }
- else {
- Admin.usage(new PrintWriter(System.out, true));
- }
- }
- catch (Exception e) {
- e.printStackTrace();
- }
- }
-
- public static void main(String[] args) throws Exception
- {
- if (args.length == 0) {
- System.err.println("Missing args to PortalImporterTool.");
- PortalImporterTool.usage(new PrintWriter(System.err, true));
- return;
- }
-
- // process
- PortalImporterTool importer = new PortalImporterTool();
- try {
- importer.setCommandParameters(args);
- importer.run();
- }
- catch (Exception e) {
- e.printStackTrace();
- }
- }
-
- private static void configureLogging()
- {
- String propertyFilename = "log4j.properties";
-
- try {
- String home = System.getenv(HOME_DIR);
- if (home != null) {
- propertyFilename = home + File.separator + "log4j.properties";
- InputStream fis = new FileInputStream(propertyFilename);
- Properties props = new Properties();
- props.load(fis);
- fis.close();
- PropertyConfigurator.configure(props);
- }
- }
- catch(IOException e) {
- System.err.println("Error loading: " + propertyFilename);
- }
- }
-
- private void validateCancerStudy(String cancerStudyDirectory) throws Exception
- {
- logMessage("validateCancerStudy(), cancer study directory: " + cancerStudyDirectory);
-
- Validator validator = (Validator)context.getBean("cancerStudyValidator");
- validator.validateCancerStudy(cancerStudyDirectory);
-
- logMessage("validateCancerStudy(), complete");
- }
-
- private void importCancerStudy(String cancerStudyDirectory, String skip, String force) throws Exception
- {
-
- logMessage("importCancerStudy(), cancer study directory: " + cancerStudyDirectory);
-
- boolean skipBool = getBoolean(skip);
- boolean forceBool = getBoolean(force);
- Importer importer = (Importer)context.getBean("cancerStudyImporter");
- importer.importCancerStudy(cancerStudyDirectory, skipBool, forceBool);
-
- logMessage("importCancerStudy(), complete");
- }
-
- private void annotateMAF(String inputFilename, String outputFilename) throws Exception {
-
- logMessage("annotateMAF(), mafFile: " + inputFilename);
-
- // sanity check
- File mafFile = new File(inputFilename);
- if (!mafFile.exists()) {
- throw new IllegalArgumentException("cannot find the given MAF: " + inputFilename);
- }
-
- // create fileUtils object
- FileUtils fileUtils = (FileUtils)context.getBean("fileUtils");
-
- // create output file
- File outputMAF =
- org.apache.commons.io.FileUtils.getFile(outputFilename);
-
- fileUtils.oncotateMAF(FileUtils.FILE_URL_PREFIX + mafFile.getCanonicalPath(),
- FileUtils.FILE_URL_PREFIX + outputMAF.getCanonicalPath());
-
- logMessage("annotateMAF(), complete");
- }
-
- private void normalizeExpressionLevels(String cnaFile, String expressionFile, String normalizedFile, String normalSampleSuffix) throws Exception
- {
- logMessage("normalizeExpressionLevels()");
- logMessage("cnaFile: " + cnaFile);
- logMessage("expressionFile: " + expressionFile);
- logMessage("outputFile: " + normalizedFile);
- logMessage("normalSampleSuffix: " + normalSampleSuffix);
-
- String[] args = { cnaFile, expressionFile, normalizedFile, normalSampleSuffix };
- NormalizeExpressionLevels.driver(args);
-
- logMessage("normalizeExpressionLevels(), complete");
- }
-
- private void deleteCancerStudy(String cancerStudyStableId) throws Exception
- {
- if (LOG.isInfoEnabled()) {
- LOG.info("deleteCancerStudy(), study id: " + cancerStudyStableId);
- }
- DaoCancerStudy.deleteCancerStudy(cancerStudyStableId);
- if (LOG.isInfoEnabled()) {
- LOG.info("deleteCancerStudy(), complete");
- }
- }
-
- private boolean getBoolean(String parameterValue)
- {
- return (parameterValue.equalsIgnoreCase("t")) ? Boolean.TRUE : Boolean.FALSE;
- }
-
- private void logMessage(String message)
- {
- if (LOG.isInfoEnabled()) {
- LOG.info(message);
- }
- System.err.println(message);
- }
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/SurvivalDataCalculator.java b/importer/src/main/java/org/mskcc/cbio/importer/SurvivalDataCalculator.java
deleted file mode 100644
index c33bb6f459e..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/SurvivalDataCalculator.java
+++ /dev/null
@@ -1,36 +0,0 @@
-/** Copyright (c) 2013 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-import org.mskcc.cbio.importer.model.DataMatrix;
-import org.mskcc.cbio.importer.model.SurvivalStatus;
-
-import java.util.List;
-
-/**
- * Interface used to compute survival data.
- */
-public interface SurvivalDataCalculator
-{
- /**
- * The list is in ascending (time) order,
- * i.e., patient matrix would come before follow-up matrices.
- */
- SurvivalStatus computeSurvivalData(List dataMatrices);
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/Validator.java b/importer/src/main/java/org/mskcc/cbio/importer/Validator.java
deleted file mode 100644
index 23bac529897..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/Validator.java
+++ /dev/null
@@ -1,59 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer;
-
-/**
- * Interface used to validate cancer study import.
- */
-public interface Validator {
-
- /**
- * Validates all cancer studies found within the given directory.
- *
- * Validates:
- *
- * - meta_study.txt exists and is 'valid'
- * -- type of cancer is set
- * -- cancer study identifier is set
- * -- name is set
- * -- description is set
- *
- * - cancer_type.txt exists and is 'valid'
- * -- validates types of cancer id found in meta_study.txt
- *
- * - validates cancer study data:
- * -- for each metadata file found:
- * --- validates all properties are set
- * --- validates proper cancer study id
- * --- validates proper genetic alteration type
- * --- validates proper stable id (prefix matches cancer study id)
- * --- validates no duplicate stable ids
- * --- existence of staging file
- *
- * - validates case list directory exists and contains case lists
- * -- for each case list:
- * --- validates all properties are set
- * --- validates proper cancer study id
- * --- validates proper stable id (prefix matches cancer study id)
- * --- validates no duplicate stable ids
- *
- * @param cancerStudyDirectoryName
- */
- boolean validateCancerStudy(String cancerStudyDirectoryName) throws Exception;
-}
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/caseids/internal/CaseIDsImpl.java b/importer/src/main/java/org/mskcc/cbio/importer/caseids/internal/CaseIDsImpl.java
deleted file mode 100644
index 45b42f19015..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/caseids/internal/CaseIDsImpl.java
+++ /dev/null
@@ -1,179 +0,0 @@
-/** Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. The software and
- * documentation provided hereunder is on an "as is" basis, and
- * Memorial Sloan-Kettering Cancer Center
- * has no obligations to provide maintenance, support,
- * updates, enhancements or modifications. In no event shall
- * Memorial Sloan-Kettering Cancer Center
- * be liable to any party for direct, indirect, special,
- * incidental or consequential damages, including lost profits, arising
- * out of the use of this software and its documentation, even if
- * Memorial Sloan-Kettering Cancer Center
- * has been advised of the possibility of such damage.
-*/
-
-// package
-package org.mskcc.cbio.importer.caseids.internal;
-
-// imports
-import org.mskcc.cbio.importer.*;
-import org.mskcc.cbio.importer.model.*;
-
-import org.mskcc.cbio.portal.dao.*;
-import org.mskcc.cbio.portal.model.*;
-
-import org.apache.commons.logging.*;
-
-import java.util.*;
-import java.util.regex.*;
-
-/**
- * Class which implements the CaseIDs interface.
- */
-public class CaseIDsImpl implements CaseIDs {
-
- private static final String SAMPLE_REGEX = "tcga-sample-pattern";
- private static final String PATIENT_REGEX = "tcga-patient-pattern";
- private static final String TRUCATED_PATIENT_REGEX = "tcga-truncated-patient-pattern";
- private static final String NON_TCGA_REGEX = "non-tcga-pattern";
-
- private static final List tcgaNormalTypes = initTCGANormalTypes();
- private static final List initTCGANormalTypes()
- {
- return Arrays.asList(new String[] { "10","11","12","13","14","15","16","17","18","19" });
- }
-
- // ref to our matchers
- private Pattern samplePattern;
- private Pattern patientPattern;
- private Pattern truncatedTCGAPatientPattern;
- private Pattern nonTCGAPattern;
-
- /**
- * Constructor.
- *
- * @param config Config
- */
- public CaseIDsImpl(Config config) {
-
- // get all the filters
- Collection caseIDFilters = config.getCaseIDFilterMetadata(Config.ALL);
-
- // sanity check
- if (caseIDFilters == null) {
- throw new IllegalArgumentException("cannot instantiate a proper collection of CaseIDFilterMetadata objects.");
- }
-
- // setup our matchers
- for (CaseIDFilterMetadata caseIDFilter : caseIDFilters) {
- if (caseIDFilter.getFilterName().equals(PATIENT_REGEX)) {
- patientPattern = Pattern.compile(caseIDFilter.getRegex());
- }
- else if (caseIDFilter.getFilterName().equals(TRUCATED_PATIENT_REGEX)) {
- truncatedTCGAPatientPattern = Pattern.compile(caseIDFilter.getRegex());
- }
- else if (caseIDFilter.getFilterName().equals(SAMPLE_REGEX)) {
- samplePattern = Pattern.compile(caseIDFilter.getRegex());
- }
- else if (caseIDFilter.getFilterName().equals(NON_TCGA_REGEX)) {
- nonTCGAPattern = Pattern.compile(caseIDFilter.getRegex());
- }
- }
- }
-
- @Override
- public boolean isSampleId(String caseId)
- {
- return isSampleId(0, caseId);
- }
-
- @Override
- public boolean isSampleId(int cancerStudyId, String caseId)
- {
- if (nonTCGAPattern.matcher(caseId).matches()) {
- Sample s = DaoSample.getSampleByCancerStudyAndSampleId(cancerStudyId, caseId);
- return (s != null);
- }
- else {
- caseId = clean(caseId);
- return (samplePattern.matcher(caseId).matches());
- }
- }
-
- @Override
- public boolean isNormalId(String caseId)
- {
- String cleanId = clean(caseId);
- Matcher matcher = samplePattern.matcher(cleanId);
- return (matcher.find()) ? tcgaNormalTypes.contains(matcher.group(2)) : false;
- }
-
- @Override
- public boolean isTruncatedTCGAPatientId(String caseId)
- {
- return truncatedTCGAPatientPattern.matcher(caseId).matches();
- }
-
- @Override
- public String getSampleId(String caseId)
- {
- return getSampleId(0, caseId);
- }
-
- @Override
- public String getSampleId(int cancerStudyId, String caseId)
- {
- if (nonTCGAPattern.matcher(caseId).matches()) {
- Sample s = DaoSample.getSampleByCancerStudyAndSampleId(cancerStudyId, caseId);
- return (s != null) ? s.getStableId() : caseId;
- }
- else {
- String cleanId = clean(caseId);
- Matcher matcher = samplePattern.matcher(cleanId);
- return (matcher.find()) ? matcher.group(1) : caseId;
- }
- }
-
- @Override
- public String getPatientId(String caseId)
- {
- return getPatientId(0, caseId);
- }
-
- @Override
- public String getPatientId(int cancerStudyId, String caseId)
- {
- if (nonTCGAPattern.matcher(caseId).matches()) {
- // data files should only have sample ids - get patient id via sample
- Sample s = DaoSample.getSampleByCancerStudyAndSampleId(cancerStudyId, caseId);
- if (s != null && s.getInternalPatientId() > 0) {
- Patient p = DaoPatient.getPatientById(s.getInternalPatientId());
- return (p != null) ? p.getStableId() : caseId;
- }
- else {
- return caseId;
- }
- }
- else {
- String cleanId = clean(caseId);
- Matcher matcher = patientPattern.matcher(cleanId);
- return (matcher.find()) ? matcher.group(1) : caseId;
- }
- }
-
- private String clean(String caseId)
- {
- if (caseId.contains("Tumor")) {
- return caseId.replace("Tumor", "01");
- }
- else if (caseId.contains("Normal")) {
- return caseId.replace("Normal", "11");
- }
- else {
- return caseId;
- }
- }
-}
\ No newline at end of file
diff --git a/importer/src/main/java/org/mskcc/cbio/importer/config/internal/GDataImpl.java b/importer/src/main/java/org/mskcc/cbio/importer/config/internal/GDataImpl.java
deleted file mode 100644
index 1ac06eed54c..00000000000
--- a/importer/src/main/java/org/mskcc/cbio/importer/config/internal/GDataImpl.java
+++ /dev/null
@@ -1,1148 +0,0 @@
-/**
- * Copyright (c) 2012 Memorial Sloan-Kettering Cancer Center.
- *
- * This library is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF MERCHANTABILITY OR FITNESS
- * FOR A PARTICULAR PURPOSE. The software and documentation provided hereunder
- * is on an "as is" basis, and Memorial Sloan-Kettering Cancer Center has no
- * obligations to provide maintenance, support, updates, enhancements or
- * modifications. In no event shall Memorial Sloan-Kettering Cancer Center be
- * liable to any party for direct, indirect, special, incidental or
- * consequential damages, including lost profits, arising out of the use of this
- * software and its documentation, even if Memorial Sloan-Kettering Cancer
- * Center has been advised of the possibility of such damage.
- */
-// package
-package org.mskcc.cbio.importer.config.internal;
-
-// imports
-import org.mskcc.cbio.importer.Config;
-import org.mskcc.cbio.importer.model.*;
-import org.mskcc.cbio.importer.NCIcaDSRFetcher;
-import org.mskcc.cbio.importer.util.ClassLoader;
-
-import org.apache.commons.logging.Log;
-import org.apache.commons.logging.LogFactory;
-
-import com.google.common.base.Strings;
-import com.google.gdata.data.spreadsheet.*;
-import com.google.gdata.client.spreadsheet.*;
-import com.google.gdata.util.common.base.Preconditions;
-import com.google.common.collect.Lists;
-
-
-import java.util.*;
-import java.util.Calendar;
-import java.lang.reflect.Method;
-import java.util.regex.Matcher;
-import java.util.regex.Pattern;
-
-/**
- * Class which implements the Config interface using google docs as a backend.
- */
-class GDataImpl implements Config {
-
- // our logger
- private static Log LOG = LogFactory.getLog(GDataImpl.class);
-
- // google docs user
- private String gdataUser;
- // google docs password
- private String gdataPassword;
- // ref to spreadsheet client
- private SpreadsheetService spreadsheetService;
- private NCIcaDSRFetcher nciDSRFetcher;
-
- // for performance optimization - we try to limit the number of accesses to google
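- ArrayList<ArrayList<String>> cancerStudiesMatrix;
- ArrayList<ArrayList<String>> caseIDFiltersMatrix;
- ArrayList<ArrayList<String>> caseListMatrix;
- ArrayList<ArrayList<String>> clinicalAttributesNamespaceMatrix;
- ArrayList<ArrayList<String>> clinicalAttributesMatrix;
- ArrayList<ArrayList<String>> datatypesMatrix;
- ArrayList<ArrayList<String>> dataSourcesMatrix;
- ArrayList<ArrayList<String>> portalsMatrix;
- ArrayList<ArrayList<String>> referenceMatrix;
- ArrayList<ArrayList<String>> oncotreeMatrix;
- ArrayList<ArrayList<String>> oncotreePropertyMatrix;
- ArrayList<ArrayList<String>> tcgaTumorTypesMatrix;
- ArrayList<ArrayList<String>> foundationMatrix;
- ArrayList<ArrayList<String>> icgcMatrix;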
-
- // worksheet names we need for updates
- private String gdataSpreadsheet;
- private String oncotreeWorksheet;
- private String oncotreePropertyWorksheet;
- private String datatypesWorksheet;
- private String caseIDFiltersWorksheet;
- private String caseListWorksheet;
- private String clinicalAttributesNamespaceWorksheet;
- private String clinicalAttributesWorksheet;
- private String portalsWorksheet;
- private String referenceDataWorksheet;
- private String dataSourcesWorksheet;
- private String cancerStudiesWorksheet;
- private String foundationWorksheet;
- private String tcgaTumorTypesWorksheet;
- private String icgcWorksheet;
-
- private final String HTML_COLOR_NAME_ONCOTREE_PROP = "HTML_COLOR_NAME";
-
- /**
- * Constructor.
- *
- * Constructor args are passed via applicationContext. We do this so that all our
- * metadata objects can be retrieved during construction of this class, which
- * prevents us from having to access google more than once. Of course any changes to
- * the google docs will not be reflected in this class until its next instantiation.
- */
- public GDataImpl(String gdataUser, String gdataPassword, SpreadsheetService spreadsheetService,
- String gdataSpreadsheet,
- String oncotreeWorksheet, String oncotreePropertyWorksheet,
- String datatypesWorksheet,
- String caseIDFiltersWorksheet, String caseListWorksheet,
- String clinicalAttributesNamespaceWorksheet, String clinicalAttributesWorksheet,
- String portalsWorksheet, String referenceDataWorksheet, String dataSourcesWorksheet, String cancerStudiesWorksheet,
- String foundationWorksheet, String icgcWorksheet, String tcgaTumorTypesWorksheet, NCIcaDSRFetcher nciDSRFetcher)
- {
- // set members
- this.gdataUser = gdataUser;
- this.gdataPassword = gdataPassword;
- this.spreadsheetService = spreadsheetService;
- this.nciDSRFetcher = nciDSRFetcher;
-
- // save name(s) of worksheet we update later
- this.gdataSpreadsheet = gdataSpreadsheet;
- this.oncotreeWorksheet = oncotreeWorksheet;
- this.oncotreePropertyWorksheet = oncotreePropertyWorksheet;
- this.datatypesWorksheet = datatypesWorksheet;
- this.caseIDFiltersWorksheet = caseIDFiltersWorksheet;
- this.caseListWorksheet = caseListWorksheet;
- this.clinicalAttributesNamespaceWorksheet = clinicalAttributesNamespaceWorksheet;
- this.clinicalAttributesWorksheet = clinicalAttributesWorksheet;
- this.portalsWorksheet = portalsWorksheet;
- this.referenceDataWorksheet = referenceDataWorksheet;
- this.dataSourcesWorksheet = dataSourcesWorksheet;
- this.cancerStudiesWorksheet = cancerStudiesWorksheet;
- this.foundationWorksheet = foundationWorksheet;
- this.icgcWorksheet = icgcWorksheet;
- this.tcgaTumorTypesWorksheet = tcgaTumorTypesWorksheet;
- }
-
- /**
- * Function to get tumor types to download as String[]
- *
- * @return String[]
- */
- @Override
- public String[] getTumorTypesToDownload() {
-
- String toReturn = "";
- for (TCGATumorTypeMetadata tcgaTumorTypeMetadata : getTCGATumorTypeMetadata()) {
- toReturn += tcgaTumorTypeMetadata.getTCGACode() + ":";
- }
-
- // outta here
- return toReturn.split(":");
- }
-
- private Collection getTCGATumorTypeMetadata()
- {
- if (tcgaTumorTypesMatrix == null) {
- tcgaTumorTypesMatrix = getWorksheetData(gdataSpreadsheet, tcgaTumorTypesWorksheet);
- }
-
- return (Collection)getMetadataCollection(tcgaTumorTypesMatrix,
- "org.mskcc.cbio.importer.model.TCGATumorTypeMetadata");
- }
-
- public TCGATumorTypeMetadata getTCGATumorTypeMetadata(String oncotreeCode)
- {
- for (TCGATumorTypeMetadata md : getTCGATumorTypeMetadata()) {
- if (md.getOncoTreeCode().equals(oncotreeCode)) {
- return md;
- }
- }
- return null;
- }
-
- /**
- * Splits an oncotree cell of the form "Name (CODE)" into its name and code.
- * If the cell has no parenthesized code, the code is returned as an empty string.
- *
- * @param dataCell String
- * @return String[] { name, code }
- */
- private String[] extractTumorTypeData(String dataCell) {
- String[] ret = new String[2];
- if (dataCell.contains("(") && dataCell.contains(")")) {
- String[] splitCell = dataCell.split("\\(");
- ret[0] = splitCell[0].trim();
- ret[1] = splitCell[1].split("\\)")[0];
- } else {
- // tissue
- ret[0] = dataCell;
- ret[1] = "";
- }
- return ret;
- }
-
- private TumorTypeMetadata parseTumorTypeMetadata(ArrayList<String> line, int index, HashMap<String, HashMap<String, String>> propertyMap) {
- int newEntIndex = index;
- String newEnt = line.get(newEntIndex).trim();
- if (newEnt.isEmpty()) {
- return null;
- }
- String parentEnt = newEntIndex==0?"tissue":line.get(newEntIndex - 1).trim();
-
- String tissue = line.get(0);
- String color = propertyMap.get(tissue).get(HTML_COLOR_NAME_ONCOTREE_PROP);
- String[] newEntData = extractTumorTypeData(newEnt);
- String name = newEntData[0];
- String id = newEntData[1];
- if (id.isEmpty()) {
- id = name;
- }
- String[] parentEntData = extractTumorTypeData(parentEnt);
- String parent = parentEntData[1];
- if (parent.isEmpty()) {
- parent = parentEntData[0];
- }
- String clinicalTrialKeywords = name.toLowerCase();
- return new TumorTypeMetadata(id, name, color, parent, clinicalTrialKeywords, tissue);
- }
-
- @Override
- public Collection getTumorTypeMetadata(String tumorType) {
-
- Collection toReturn = new ArrayList();
-
- if (oncotreeMatrix == null) {
- oncotreeMatrix = getWorksheetData(gdataSpreadsheet, oncotreeWorksheet);
- }
- if (oncotreePropertyMatrix == null) {
- oncotreePropertyMatrix = getWorksheetData(gdataSpreadsheet, oncotreePropertyWorksheet);
- }
- HashMap<String, HashMap<String, String>> propertyMap = new HashMap<>();
- for (int i=1; i<oncotreePropertyMatrix.size(); i++) {
- ArrayList<String> line = oncotreePropertyMatrix.get(i);
- String nodeName = line.get(0);
- String propertyName = line.get(1);
- String propertyValue = line.get(2);
- if (!propertyMap.containsKey(nodeName)) {
- propertyMap.put(nodeName, new HashMap());
- }
- propertyMap.get(nodeName).put(propertyName, propertyValue);
- }
-
- HashMap tumorTypes = new HashMap<>();
- int endOfData = 0;
- ArrayList line = oncotreeMatrix.get(0);
- for (; endOfData tumorTypeMetadatas = tumorTypes.values();
- // if user wants all, we're done
- if (tumorType.equals(Config.ALL)) {
- return tumorTypeMetadatas;
- }
-
- // iterate over all TumorTypeMetadata looking for match
- for (TumorTypeMetadata tumorTypeMetadata : tumorTypeMetadatas) {
- if (tumorTypeMetadata.getType().equals(tumorType)) {
- toReturn.add(tumorTypeMetadata);
- break;
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Function to get datatypes to download as String[]
- *
- * @param dataSourcesMetadata DataSourcesMetadata
- * @return String[]
- * @throws Exception
- */
- @Override
- public String[] getDatatypesToDownload(DataSourcesMetadata dataSourcesMetadata) throws Exception {
-
- HashSet toReturn = new HashSet();
- for (DatatypeMetadata datatypeMetadata : getDatatypeMetadata(Config.ALL)) {
- if (datatypeMetadata.isDownloaded()) {
- Method downloadArchivesMethod = datatypeMetadata.getDownloadArchivesMethod(dataSourcesMetadata.getDataSource());
- toReturn.addAll((Set) downloadArchivesMethod.invoke(datatypeMetadata, null));
- }
- }
-
- // outta here
- return toReturn.toArray(new String[0]);
- }
-
- /**
- * Function to determine the datatype(s) of the datasource file (the file
- * that was fetched from a datasource).
- *
- * @param dataSourcesMetadata DataSourcesMetadata
- * @param filename String
- * @return Collection
- * @throws Exception
- */
- @Override
- public Collection getFileDatatype(DataSourcesMetadata dataSourcesMetadata, String filename) throws Exception {
-
- Collection toReturn = new ArrayList();
- for (DatatypeMetadata datatypeMetadata : getDatatypeMetadata(Config.ALL)) {
- Method downloadArchivesMethod = datatypeMetadata.getDownloadArchivesMethod(dataSourcesMetadata.getDataSource());
- for (String archive : (Set) downloadArchivesMethod.invoke(datatypeMetadata, null)) {
- if (filename.contains(archive)) {
- toReturn.add(datatypeMetadata);
- }
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Gets a DatatypeMetadata object for the given datatype name. If datatype
- * == Config.ALL, all are returned.
- *
- * @param datatype String
- * @return Collection
- */
- @Override
- public Collection getDatatypeMetadata(String datatype) {
-
- Collection toReturn = new ArrayList();
-
- if (LOG.isInfoEnabled()) {
- LOG.info("getDatatypeMetadata(): " + datatype);
- }
-
- if (datatypesMatrix == null) {
- datatypesMatrix = getWorksheetData(gdataSpreadsheet, datatypesWorksheet);
- }
-
- Collection datatypeMetadatas
- = (Collection) getMetadataCollection(datatypesMatrix,
- "org.mskcc.cbio.importer.model.DatatypeMetadata");
- // if user wants all, we're done
- if (datatype.equals(Config.ALL)) {
- return datatypeMetadatas;
- }
-
- for (DatatypeMetadata datatypeMetadata : datatypeMetadatas) {
- if (datatypeMetadata.getDatatype().equals(datatype)) {
- toReturn.add(datatypeMetadata);
- break;
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Gets a collection of Datatype names for the given portal/cancer study.
- *
- * @param portalMetadata PortalMetadata
- * @param cancerStudyMetadata CancerStudyMetadata
- * @return Collection
- */
- @Override
- public Collection getDatatypeMetadata(PortalMetadata portalMetadata, CancerStudyMetadata cancerStudyMetadata) {
-
- Collection toReturn = new ArrayList();
-
- if (LOG.isInfoEnabled()) {
- LOG.info("getDatatypeMetadata(): " + portalMetadata.getName() + ":" + cancerStudyMetadata.toString());
- }
-
- if (cancerStudiesMatrix == null) {
- cancerStudiesMatrix = getWorksheetData(gdataSpreadsheet, cancerStudiesWorksheet);
- }
-
- // get portal-column index in the cancer studies worksheet
- int portalColumnIndex = cancerStudiesMatrix.get(0).indexOf(portalMetadata.getName());
- if (portalColumnIndex == -1) {
- return toReturn;
- }
-
- // iterate over all studies in worksheet and find row whose first element is cancer study (path)
- for (ArrayList matrixRow : cancerStudiesMatrix) {
- if (matrixRow.get(0).equals(cancerStudyMetadata.getStudyPath())) {
- // the datatypes for the portal/cancer_study is the value of the cell
- String datatypesIndicator = matrixRow.get(portalColumnIndex);
- if (datatypesIndicator.equalsIgnoreCase(CancerStudyMetadata.CANCER_STUDY_IN_PORTAL_INDICATOR)) {
- // all datatypes are desired
- toReturn = getDatatypeMetadata(Config.ALL);
- } else {
- // a delimited list of datatypes have been requested
- toReturn = new ArrayList();
- for (String datatype : datatypesIndicator.split(DatatypeMetadata.DATATYPES_DELIMITER)) {
- Collection metaData = getDatatypeMetadata(datatype);
- if (!metaData.isEmpty()) {
- DatatypeMetadata datatypeMetadata = metaData.iterator().next();
- toReturn.add(datatypeMetadata);
- if (LOG.isInfoEnabled()) {
- LOG.info("Selecting data type: " + datatypeMetadata.getDatatype());
- }
- }
- }
- }
- break;
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Gets a collection of CaseIDFilterMetadata.
- *
- * @param filterName String
- * @return Collection
- */
- @Override
- public Collection getCaseIDFilterMetadata(String filterName) {
-
- Collection toReturn = new ArrayList();
-
- if (caseIDFiltersMatrix == null) {
- caseIDFiltersMatrix = getWorksheetData(gdataSpreadsheet, caseIDFiltersWorksheet);
- }
-
- Collection caseIDFilterMetadatas
- = (Collection) getMetadataCollection(caseIDFiltersMatrix,
- "org.mskcc.cbio.importer.model.CaseIDFilterMetadata");
-
- // if user wants all, we're done
- if (filterName.equals(Config.ALL)) {
- return caseIDFilterMetadatas;
- }
-
- for (CaseIDFilterMetadata caseIDFilterMetadata : caseIDFilterMetadatas) {
- if (caseIDFilterMetadata.getFilterName().equals(filterName)) {
- toReturn.add(caseIDFilterMetadata);
- break;
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Gets a collection of CaseListMetadata. If caseListFilename == Config.ALL,
- * all are returned.
- *
- * @param caseListFilename String
- * @return Collection
- */
- @Override
- public Collection getCaseListMetadata(String caseListFilename) {
-
- Collection toReturn = new ArrayList();
-
- if (caseListMatrix == null) {
- caseListMatrix = getWorksheetData(gdataSpreadsheet, caseListWorksheet);
- }
-
- Collection caseListMetadatas
- = (Collection) getMetadataCollection(caseListMatrix,
- "org.mskcc.cbio.importer.model.CaseListMetadata");
-
- // if user wants all, we're done
- if (caseListFilename.equals(Config.ALL)) {
- return caseListMetadatas;
- }
-
- for (CaseListMetadata caseListMetadata : caseListMetadatas) {
- if (caseListMetadata.getCaseListFilename().equals(caseListFilename)) {
- toReturn.add(caseListMetadata);
- break;
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Gets a collection of ClinicalAttributesNamespace. If
- * clinicalAttributeNamespaceColumnHeader == Config.ALL, all are returned.
- *
- * @param clinicalAttributesNamespaceColumnHeader String
- * @return Collection
- */
- @Override
- public Collection getClinicalAttributesNamespace(String clinicalAttributesNamespaceColumnHeader) {
-
- Collection toReturn = new ArrayList();
-
- if (clinicalAttributesNamespaceMatrix == null) {
- clinicalAttributesNamespaceMatrix = getWorksheetData(gdataSpreadsheet, clinicalAttributesNamespaceWorksheet);
- }
-
- Collection clinicalAttributesNamespace
- = (Collection) getMetadataCollection(clinicalAttributesNamespaceMatrix,
- "org.mskcc.cbio.importer.model.ClinicalAttributesNamespace");
-
- // if user wants all, we're done
- if (clinicalAttributesNamespaceColumnHeader.equals(Config.ALL)) {
- return clinicalAttributesNamespace;
- }
-
- for (ClinicalAttributesNamespace clinicalAttributesNamespaceEntry : clinicalAttributesNamespace) {
- if (clinicalAttributesNamespaceEntry.getExternalColumnHeader().equals(clinicalAttributesNamespaceColumnHeader)) {
- toReturn.add(clinicalAttributesNamespaceEntry);
- break;
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Gets a collection of ClinicalAttributesMetadata. If
- * clinicalAttributeColumnHeader == Config.ALL, all are returned.
- *
- * @param clinicalAttributesColumnHeader String
- * @return Collection
- */
- @Override
- public Collection getClinicalAttributesMetadata(String clinicalAttributesColumnHeader) {
-
- Collection toReturn = new ArrayList();
-
- if (clinicalAttributesMatrix == null) {
- clinicalAttributesMatrix = getWorksheetData(gdataSpreadsheet, clinicalAttributesWorksheet);
- }
-
- Collection clinicalAttributesMetadatas
- = (Collection) getMetadataCollection(clinicalAttributesMatrix,
- "org.mskcc.cbio.importer.model.ClinicalAttributesMetadata");
-
- // if user wants all, we're done
- if (clinicalAttributesColumnHeader.equals(Config.ALL)) {
- return clinicalAttributesMetadatas;
- }
-
- for (ClinicalAttributesMetadata clinicalAttributesMetadata : clinicalAttributesMetadatas) {
- if (clinicalAttributesMetadata.getNormalizedColumnHeader().equals(clinicalAttributesColumnHeader)) {
- toReturn.add(clinicalAttributesMetadata);
- break;
- }
- }
-
- // outta here
- return toReturn;
- }
-
- @Override
- public Map getClinicalAttributesMetadata(Collection externalColumnHeaders) {
- Map toReturn = new HashMap();
-
- HashMap clinicalAttributesNamespace = makeClinicalAttributesNamespaceHashMap();
- for (String externalColumnHeader : externalColumnHeaders) {
- if (clinicalAttributesNamespace.containsKey(externalColumnHeader)) {
- ClinicalAttributesNamespace namespace = clinicalAttributesNamespace.get(externalColumnHeader);
- if (!namespace.getNormalizedColumnHeader().isEmpty()) {
- Collection metadata = getClinicalAttributesMetadata(namespace.getNormalizedColumnHeader());
- if (metadata.size() == 1) {
- toReturn.put(externalColumnHeader, metadata.iterator().next());
- }
- }
- }
- }
- return toReturn;
- }
-
- @Override
- public void importBCRClinicalAttributes(Collection bcrs) {
-
- HashMap clinicalAttributesNamespace = makeClinicalAttributesNamespaceHashMap();
-
- for (BCRDictEntry bcr : bcrs) {
- if (!clinicalAttributesNamespace.containsKey(bcr.id)) {
- updateWorksheet(gdataSpreadsheet, clinicalAttributesNamespaceWorksheet,
- true, null, null,
- ClinicalAttributesNamespace.getPropertiesMap(bcr,
- ClinicalAttributesNamespace.DATE_FORMAT.format(Calendar.getInstance().getTime())));
- }
- }
- }
-
- @Override
- public void flagMissingClinicalAttributes(String cancerStudy, String tumorType, Collection missingAttributeColumnHeaders) {
- BCRDictEntry bcr = new BCRDictEntry();
- HashMap clinicalAttributesNamespace = makeClinicalAttributesNamespaceHashMap();
-
- boolean updatedClinicalAttributes = false;
- for (String missingAttribute : missingAttributeColumnHeaders) {
- String[] parts = missingAttribute.split(ClinicalAttributesNamespace.CDE_DELIM);
- if (!clinicalAttributesNamespace.containsKey(parts[0])) {
- NCIcaDSREntry entry = (parts.length == 2 && parts[1].length() > 0)
- ? nciDSRFetcher.fetchDSREntry(parts[1]) : null;
- bcr.id = parts[0];
- bcr.displayName = (entry == null) ? "" : entry.preferredName;
- bcr.description = (entry == null) ? "" : entry.preferredDefinition;
- bcr.tumorType = tumorType;
- bcr.cancerStudy = cancerStudy;
- updateWorksheet(gdataSpreadsheet, clinicalAttributesNamespaceWorksheet,
- true, null, null,
- ClinicalAttributesNamespace.getPropertiesMap(bcr,
- ClinicalAttributesNamespace.DATE_FORMAT.format(Calendar.getInstance().getTime())));
- updatedClinicalAttributes = true;
- } else {
- ClinicalAttributesNamespace ns = clinicalAttributesNamespace.get(parts[0]);
- if (!ns.getCancerStudy().contains(cancerStudy)) {
- bcr.id = ns.getExternalColumnHeader();
- bcr.displayName = ns.getDisplayName();
- bcr.description = ns.getDescription();
- bcr.tumorType = (ns.getTumorType().contains(tumorType)) ? ns.getTumorType() : ns.getTumorType() + "," + tumorType;
- bcr.cancerStudy = ns.getCancerStudy() + "," + cancerStudy;
- updateWorksheet(gdataSpreadsheet, clinicalAttributesNamespaceWorksheet,
- false, ClinicalAttributesNamespace.WORKSHEET_UPDATE_COLUMN_KEY,
- ns.getExternalColumnHeader(),
- ClinicalAttributesNamespace.getPropertiesMap(bcr,
- ClinicalAttributesNamespace.DATE_FORMAT.format(Calendar.getInstance().getTime())));
- updatedClinicalAttributes = true;
- }
- }
- }
- if (updatedClinicalAttributes) {
- clinicalAttributesNamespaceMatrix = null;
- }
- }
-
- private HashMap makeClinicalAttributesNamespaceHashMap() {
- HashMap toReturn = new HashMap();
- for (ClinicalAttributesNamespace clinicalAttributeNamespace : getClinicalAttributesNamespace(Config.ALL)) {
- toReturn.put(clinicalAttributeNamespace.getExternalColumnHeader(), clinicalAttributeNamespace);
- }
-
- return toReturn;
- }
-
- /**
- * Gets a PortalMetadata object given a portal name.
- *
- * @param portalName String
- * @return Collection
- */
- @Override
- public Collection getPortalMetadata(String portalName) {
-
- Collection toReturn = new ArrayList();
-
- if (portalsMatrix == null) {
- portalsMatrix = getWorksheetData(gdataSpreadsheet, portalsWorksheet);
- }
-
- Collection portalMetadatas
- = (Collection) getMetadataCollection(portalsMatrix,
- "org.mskcc.cbio.importer.model.PortalMetadata");
-
- // if user wants all, we're done
- if (portalName.equals(Config.ALL)) {
- return portalMetadatas;
- }
-
- for (PortalMetadata portalMetadata : portalMetadatas) {
- if (portalMetadata.getName().equals(portalName)) {
- toReturn.add(portalMetadata);
- break;
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Gets ReferenceMetadata for the given referenceType. If referenceType ==
- * Config.ALL, all are returned.
- *
- * @param referenceType String
- * @return Collection
- */
- @Override
- public Collection getReferenceMetadata(String referenceType) {
-
- Collection toReturn = new ArrayList();
-
- if (referenceMatrix == null) {
- referenceMatrix = getWorksheetData(gdataSpreadsheet, referenceDataWorksheet);
- }
-
- Collection referenceMetadatas
- = (Collection) getMetadataCollection(referenceMatrix,
- "org.mskcc.cbio.importer.model.ReferenceMetadata");
- // if user wants all, we're done
- if (referenceType.equals(Config.ALL)) {
- return referenceMetadatas;
- }
-
- // iterate over all ReferenceMetadata looking for match
- for (ReferenceMetadata referenceMetadata : referenceMetadatas) {
- if (referenceMetadata.getReferenceType().equals(referenceType)) {
- toReturn.add(referenceMetadata);
- break;
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Gets DataSourcesMetadata for the given datasource. If dataSource ==
- * Config.ALL, all are returned.
- *
- * @param dataSource String
- * @return Collection
- */
- @Override
- public Collection getDataSourcesMetadata(String dataSource) {
-
- Collection toReturn = new ArrayList();
-
- if (dataSourcesMatrix == null) {
- dataSourcesMatrix = getWorksheetData(gdataSpreadsheet, dataSourcesWorksheet);
- }
-
- Collection dataSourceMetadatas
- = (Collection) getMetadataCollection(dataSourcesMatrix,
- "org.mskcc.cbio.importer.model.DataSourcesMetadata");
- // if user wants all, we're done
- if (dataSource.equals(Config.ALL)) {
- return dataSourceMetadatas;
- }
-
- // iterate over all DataSourcesMetadata looking for match
- for (DataSourcesMetadata dataSourceMetadata : dataSourceMetadatas) {
- if (dataSourceMetadata.getDataSource().equals(dataSource)) {
- toReturn.add(dataSourceMetadata);
- break;
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Gets all the cancer studies for a given portal.
- *
- * @param portalName String
- * @return Collection
- */
- @Override
- public Collection getCancerStudyMetadata(String portalName) {
-
- Collection toReturn = new ArrayList();
-
- if (cancerStudiesMatrix == null) {
- cancerStudiesMatrix = getWorksheetData(gdataSpreadsheet, cancerStudiesWorksheet);
- }
-
- // get portal-column index in the cancer studies worksheet
- int portalColumnIndex = cancerStudiesMatrix.get(0).indexOf(portalName);
- if (portalColumnIndex == -1) {
- return toReturn;
- }
-
- // iterate over all studies in worksheet and determine if
- // the value at the row and portal/column intersection is not empty
- // (we start at one, because row 0 is the column headers)
- for (int lc = 1; lc < cancerStudiesMatrix.size(); lc++) {
- ArrayList matrixRow = cancerStudiesMatrix.get(lc);
- String datatypesIndicator = matrixRow.get(portalColumnIndex);
- if (datatypesIndicator != null && datatypesIndicator.length() > 0) {
- CancerStudyMetadata cancerStudyMetadata
- = new CancerStudyMetadata(matrixRow.toArray(new String[0]));
- // get tumor type metadata
- Collection tumorTypeCollection = getTumorTypeMetadata(cancerStudyMetadata.getTumorType());
- if (!tumorTypeCollection.isEmpty()) {
- cancerStudyMetadata.setTumorTypeMetadata(tumorTypeCollection.iterator().next());
- }
- // add to return set
- toReturn.add(cancerStudyMetadata);
- }
- }
-
- // outta here
- return toReturn;
- }
-
- /**
- * Gets a CancerStudyMetadata for the given cancer study.
- *
- * @param cancerStudyName String - fully qualified path as entered on worksheet,
- * e.g.: prad/mskcc/foundation
- * @return CancerStudyMetadata or null if not found
- */
- @Override
- public CancerStudyMetadata getCancerStudyMetadataByName(String cancerStudyName) {
-
- Collection cancerStudyMetadatas = getAllCancerStudyMetadata();
-
- for (CancerStudyMetadata cancerStudyMetadata : cancerStudyMetadatas) {
- if (cancerStudyMetadata.getStudyPath().equals(cancerStudyName)) {
- // get tumor type metadata
- Collection tumorTypeCollection = getTumorTypeMetadata(cancerStudyMetadata.getTumorType());
- if (!tumorTypeCollection.isEmpty()) {
- cancerStudyMetadata.setTumorTypeMetadata(tumorTypeCollection.iterator().next());
- }
- return cancerStudyMetadata;
- }
- }
-
- return null;
- }
-
- @Override
- public Collection getAllCancerStudyMetadata()
- {
- if (cancerStudiesMatrix == null) {
- cancerStudiesMatrix = getWorksheetData(gdataSpreadsheet, cancerStudiesWorksheet);
- }
-
- Collection cancerStudyMetadatas
- = (Collection) getMetadataCollection(cancerStudiesMatrix,
- "org.mskcc.cbio.importer.model.CancerStudyMetadata");
-
- return cancerStudyMetadatas;
- }
-
- /*
- return a Collection of IcgcMetadata objects derived from
- the ICGC worksheet on the google spreadsheet
- */
- @Override
- public Collection getIcgcMetadata() {
- if (icgcMatrix == null) {
- icgcMatrix = getWorksheetData(gdataSpreadsheet, icgcWorksheet);
- }
-
-
- Collection icgcMetadataCollection
- = (Collection) getMetadataCollection(icgcMatrix,
- "org.mskcc.cbio.importer.model.IcgcMetadata");
-
-
- return icgcMetadataCollection;
- }
-
- /**
- * Gets FoundationMetadata.
- *
- * @return Collection
- */
- @Override
- public Collection getFoundationMetadata() {
-
- if (foundationMatrix == null) {
- foundationMatrix = getWorksheetData(gdataSpreadsheet, foundationWorksheet);
- }
-
- Collection foundationMetadatas
- = (Collection) getMetadataCollection(foundationMatrix,
- "org.mskcc.cbio.importer.model.FoundationMetadata");
-
- // outta here
- return foundationMetadatas;
- }
-
- /**
- * public method to return a List of registered cancer studies by
- * organization name
- *
- * @param organizationName
- * @return List
- */
- @Override
- public List findCancerStudiesBySubstring(final String organizationName) {
- Preconditions.checkArgument(!Strings.isNullOrEmpty(organizationName),
- "An organization name is required");
-
- if (cancerStudiesMatrix == null) {
- cancerStudiesMatrix = getWorksheetData(gdataSpreadsheet, cancerStudiesWorksheet);
- }
-
- // column 0 contains the cancer study names
- List cancerStudyList = Lists.newArrayList();
- for (List study : cancerStudiesMatrix) {
- if (study.get(0).contains(organizationName.toLowerCase())) {
- cancerStudyList.add(study.get(0));
- }
- }
- return cancerStudyList;
-
- }
-
- @Override
- public void updateCancerStudyAttributes(String cancerStudy, Map properties)
- {
- updateWorksheet(gdataSpreadsheet, cancerStudiesWorksheet, false,
- CancerStudyMetadata.WORKSHEET_UPDATE_COLUMN_KEY,
- cancerStudy, properties);
- cancerStudiesMatrix = null;
- }
-
- @Override
- public void insertCancerStudyAttributes(Map properties)
- {
- updateWorksheet(gdataSpreadsheet, cancerStudiesWorksheet,
- true, null, null, properties);
- cancerStudiesMatrix = null;
- }
-
- /**
- * Constructs a collection of objects of the given classname from the given matrix.
- *
- * @param metadataMatrix ArrayList<ArrayList<String>>
- * @param className String
- * @return Collection